Introductory Remarks


Col  James  A.  Turner 
USAF  Occupational  Measurement  Center 

Good Morning. Welcome to San Antonio, one of America's four unique
cities, home of the Air Force Human Resources Laboratory, the Occupational
Measurement Center, and Alamo Chapter USAF. It is our pleasure to extend
a welcome to our keynote speaker, General Emanuel, and to our distinguished
service representatives of the Canadian, Australian, and German Forces
and the U.S. Army, Navy, Coast Guard, and Air Force. On behalf of all
the staff of the Human Resources Lab, who led the way; the Occupational
Measurement Center, who helped; and particularly from Col Dan G. Fulgham,
Commander of AFHRL (who had to be in Washington today), I give you all
greetings and welcome. We hope you enjoy your stay here in San Antonio.

We open this conference at a time when there is considerable hue and cry
and controversy over testing, particularly intelligence testing, and the
use of scores in educational placement. It is appropriate for us to take
a frequent look at what we do with testing and the fairness of the
instruments we use. Our interest cannot be limited to testing alone since
building good tests goes well beyond just the construction of fair and
equitable examinations. We must know the tasks, the skills, the knowledges
required if the individual is to perform satisfactorily. Once we know
these things, then one can proceed with how to train the individual, and
we can determine the areas which are most realistically tested to measure
the person's promotion or other potential. In short, testing is a major
component in development and maintenance of the Air Force Personnel System.

Our guest of honor has had a good deal to do with the AF system and is well
qualified to keynote this 19th MTA conference. General Herbert L. Emanuel,
USAF, is Deputy to the Assistant Deputy Chief of Staff, Personnel for
Military Personnel, and Vice Commander of the Air Force Military Personnel
Center at Randolph Air Force Base, Texas. He has a Baccalaureate in
Communicative Arts, a Master's Degree in Personnel Management, and is a
graduate of the Armed Forces Staff College and the Industrial College of
the Armed Forces. General Emanuel entered the Air Force as a 2nd Lt via
the ROTC program in May 1932 and served three successive assignments as a
personnel officer. After many enriching and broadening assignments
worldwide, including a stint as an Information Officer, Director of Cadet
Activities at the Air Force Academy, and a tour in Vietnam, he was assigned
to HQ USAF as a Personnel Planner. He was involved in the initial
publication of the USAF Personnel Plan and development of the AF Personnel
Management by Objectives System, which still guides Air Force personnel
activities. After completing the Industrial College of the Armed Forces,
he returned to the Headquarters. As Deputy Director of Personnel for Plans
and Policy, he was instrumental in the design and implementation of many
initiatives calculated to foster morale and force discipline in the austere
personnel environment of the 1970's. He assumed his present position in
March 1976. It is with great pleasure I introduce to you General Herb Emanuel.

FUTURE TRENDS IN MILITARY PERSONNEL
TESTING: SOME SPECULATIONS

BY 

BRIG  GEN  H.  L.  EMANUEL 
AIR  FORCE  MILITARY  PERSONNEL  CENTER 


Distinguished Guests, Testing Researchers, Testing
Practitioners, Personnel Managers, Members of the Military
Testing Association.

It  is  certainly  a  pleasure  and  a  privilege  to  speak 
to  you  today.  I  was  particularly  pleased  to  learn  that 
your  membership  includes  representatives  from  the  Canadian 
Armed Forces and The Royal Australian Air Force, in addition
to representatives from the US military services,
academia,  and  the  business  world.  The  important  work 
which  you  do  should  certainly  be  enhanced  by  the  spirit 
of  international  cooperation  which  you  enjoy. 

Within the military, we are confronted with the same
personnel problems as any other organization, whether
large or small, public or private--that of shaping and
adapting available human resources into useful and effective
manpower. The very multiplicity of skills required
by the military poses problems in personnel, training,
and manpower utilization which are unprecedented. Personnel
requirements change rapidly and on a large scale,
and are dependent to a large extent upon technological
advances and the international political situation.

Obviously, military personnel management is a highly
complex affair. As you know, to cope with these complexities
requires creative and innovative personnel research--
research which addresses all aspects of the personnel
life cycle: selection, classification, training, performance
appraisal, promotion, and organizational development.
All of these involve test instruments of some kind. Such
topics are of great interest to us in the personnel management
business--an interest engendered from two basic
sources. First, we are users of your product. Our effectiveness
as personnel managers hinges on the successful
application of techniques and procedures developed from
past personnel research. Second, we are sponsors of your
research. In that role, we serve as the liaison agency
between you and military functional managers outside the
personnel community--encouraging, explaining, and extolling
the virtues of research and its applications.

Thus, we have a very close and empathetic relationship
with personnel research scientists. We depend on
you for timely and efficient solutions to management
problems as well as for input into the formulation of
personnel policy. You, in turn, depend on us as a sort of
public relations expert who ensures your various efforts
are understood and appreciated not only across the military
rank and file, but at the highest echelons of service
and defense management as well.

Now, to the subject at hand. At the 1973 MTA Conference,
General John W. Roberts, then Air Force Deputy Chief
of Staff for Personnel, proposed several refinements needed
if future military personnel tests were to make significant
contributions in the improvement of military personnel
management. Those suggestions were subsequently adopted
as objectives/standards for the development of future Air
Force testing. Let me summarize them for you:

First, tests of the future should focus more on the
individual--on his unique talents and desires.

Second,  tests  should  maximize  the  opportunity  for 
choice--both  by  the  individual  and  by  his  employer. 

Third — as  a  function  of  the  first  two — tests  should 
provide  broader  profiles  of  information. 

Since the 1973 Conference, we have continued our
efforts to streamline our personnel management techniques
to make more effective use of our available human talent.
These efforts have included new and more sophisticated
ways to assess the aptitudes and attitudes of our people.
Since this is the first time the Air Force has hosted the
MTA since 1973, it now seems particularly appropriate to
report  on  some  of  our  ongoing  testing  projects  and  see  how 
well  they  meet  the  standards  established  at  that  time.  In 
so  doing,  I  believe  that  trends  for  future  testing  will 
begin  to  emerge. 

Computerized  Adaptive  Testing 

Computerized  Adaptive  Testing  represents  the  first 
real  potential  breakthrough  in  the  personnel  testing  area 
in  the  last  25  years.  Although  the  most  noticeable  change 
in  the  new  method  of  testing  is  the  fact  that  the  test  is 
administered  by  computer,  the  essential  difference  between 
this  method  and  paper-and-pencil  tests  is  that  each  examinee 
will  answer  a  special  set  of  test  questions  "tailored"  to 
his/her ability. Adaptive testing is a way of allowing
those tested to answer only those questions that are
suited to their individual ability. This contrasts with
conventional group testing procedures which require many
people to spend time on questions that are either too easy
or too difficult for them.
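
The tailoring idea can be sketched in a few lines of code. What follows is
only an illustration, not the AFHRL prototype: it assumes a hypothetical item
bank in which every question carries a single difficulty value, and it uses a
simple up-and-down rule (move the ability estimate up after a correct answer,
down after an incorrect one, halving the step each time). The function and
parameter names are invented for the example.

```python
def run_adaptive_test(item_difficulties, answers_correctly,
                      n_items=5, start=0.0, step=1.0):
    """Give n_items questions, each chosen to match the current estimate.

    item_difficulties: difficulty values of the (hypothetical) item bank.
    answers_correctly: callable taking a difficulty, returning True/False.
    """
    remaining = sorted(item_difficulties)
    estimate = start
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Choose the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda d: abs(d - estimate))
        remaining.remove(item)
        correct = answers_correctly(item)
        administered.append((item, correct))
        # Up-and-down rule with a shrinking step size.
        estimate += step if correct else -step
        step /= 2
    return estimate, administered


if __name__ == "__main__":
    bank = [x / 2 for x in range(-6, 7)]          # difficulties -3.0 to 3.0
    true_ability = 1.2                            # assumed for the demo
    examinee = lambda difficulty: difficulty <= true_ability
    estimate, log = run_adaptive_test(bank, examinee)
    print("estimate:", estimate)
    for difficulty, correct in log:
        print(difficulty, "correct" if correct else "incorrect")
```

With this rule the examinee sees only a handful of items near his or her own
level rather than the whole bank, which is the source of the time savings
described in the text. Operational systems would use a statistical ability
estimate in place of the fixed-step rule assumed here.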

Computerized adaptive testing can have major benefits,
both  in  efficiency  and  test  quality.  A  test  can  be  taken 
at  any  time;  no  examiner  time  or  special  scheduling  is 
needed.  The  examination  time  will  be  shorter;  several 
abilities  can  be  tested  in  the  time  it  now  takes  for  one. 
And,  because  examinees  cannot  be  sure  which  questions  will 
be  asked,  it  retards,  if  not  eliminates,  the  problem  of 
test  compromise. 

Air Force research in this area is directed toward
possible application at the 66 nation-wide armed forces
examining and entrance stations (AFEES). The Air Force
Human Resources Laboratory (AFHRL) has already prepared
a prototype demonstration model which is currently on-line
at the San Antonio AFEES. (In fact, this prototype is
available here at the conference should you care to see
it.) In addition to providing personnel managers with a
look at what computerized testing is all about, it is
enabling AFHRL to gain firsthand knowledge of AFEES
requirements vis-a-vis computer arrays so that future
hardware may be more appropriately human engineered.

Obviously, before computers can be used to test
applicants for military service, the cost of procuring
computer systems, display terminals, and related technology
will have to be substantially reduced. Yet, we
know that this is the enlistment testing of the future,
so we have started planning for it now. In that regard,
a joint service working group on R&D applications of computer
technology to military personnel acquisitions has
been formed to oversee and coordinate work in each of the
services. In addition to R&D, implementation poses unique
logistical/managerial problems which need to be addressed.
The U.S. Civil Service Commission has set up a task force
to consider those kinds of issues. We trust they will
share their experiences with us. In any event, it behooves
us to look at all aspects of testing by computer so when
the "Science" is finally ready, so is "Management."

Prediction  of  Motivational  Attritions 


The high rate of involuntary attrition that occurs
among military personnel is the subject of growing concern
at all levels of the Department of Defense (DOD). The
Defense Manpower Commission in a recent report has noted
that DOD incurs an annual cost of approximately one
billion dollars because one out of every four DOD accessions
is involuntarily separated prior to completion of
the first term of enlistment. A great percentage of
those  discharged  are  identified  by  the  training  centers 
in  the  early  stages  of  the  enlistee's  basic  or  recruit 
training.  Each  of  the  services  operates  a  program  designed 
to  identify  as  early  as  possible  those  who  will  ultimately 
fail  to  adapt  to  the  military  service  and  to  separate  them 
administratively. 

The use of the early discharge programs as a screening
device is both inefficient and costly, compared to
screening programs operated at the point of entry, prior
to enlistment. Thus, one solution to the problem is to
increase the effectiveness of the pre-enlistment selection
system so as to better predict the probability of an
individual's successful adaptation to the military life.

One  effort  to  solve  this  problem  which  is  currently 
ongoing  is  the  Motivational  Attrition  Prediction  (MAP) 
model.  This  new  approach  applies  the  maximum  likelihood 
estimation  technique  which  we  believe  will  achieve  better 
differentiation  between  potential  failures  and  successes. 
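
To make the idea of a maximum likelihood prediction equation concrete, here
is a minimal sketch. It is not the MAP model itself: the talk does not give
the model's form or its predictors, so the sketch assumes a logistic equation,
two invented predictors (an aptitude score and a no-diploma flag), and
entirely synthetic applicants. The weights are found by climbing the
log-likelihood with plain gradient ascent.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_applicants(n, rng):
    """Made-up data: [aptitude z-score, no-diploma flag] and a 0/1 outcome."""
    rows, outcomes = [], []
    for _ in range(n):
        aptitude = rng.gauss(0.0, 1.0)
        no_diploma = 1.0 if rng.random() < 0.3 else 0.0
        # Assumed "true" relationship, used only to generate the demo data.
        p_attrite = sigmoid(-1.0 - 0.8 * aptitude + 1.2 * no_diploma)
        rows.append([aptitude, no_diploma])
        outcomes.append(1 if rng.random() < p_attrite else 0)
    return rows, outcomes


def fit_logistic_mle(rows, outcomes, lr=0.5, epochs=500):
    """Maximise the Bernoulli log-likelihood by batch gradient ascent."""
    weights = [0.0] * (len(rows[0]) + 1)       # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, outcomes):
            p = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            err = y - p                        # gradient w.r.t. the linear term
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict_attrition(weights, x):
    """Estimated probability of attrition for one applicant."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


if __name__ == "__main__":
    rows, outcomes = make_synthetic_applicants(1000, random.Random(0))
    weights = fit_logistic_mle(rows, outcomes)
    print("fitted weights:", weights)
    print("risk, low aptitude, no diploma:", predict_attrition(weights, [-1.0, 1.0]))
    print("risk, high aptitude, diploma:  ", predict_attrition(weights, [1.0, 0.0]))
```

Used as a screening device, an equation of this kind would be fitted on one
group and then applied to a later group before their outcomes are known.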

The model was initially tested at the United States
Air Force Academy. To evaluate the method in an operational
setting, i.e., as a screening device, a prediction
equation was developed using the Class of 1977
and applied a priori to the Class of 1979. Within six
months, <9% of the predicted failures had resigned.

Because results of the Air Force Academy study
were so promising, the model was next tested using 15,000
1972 Air Force enlistees to predict their first term
involuntary attrition. After estimating and applying a
new prediction equation, 57% of the group predicted to
fail had been involuntarily discharged prior to completing
their enlistment.

Such  success  with  the  MAP  model  has  prompted  still 
further  research  into  its  utility  as  a  pre-enlistment 
screening  tool.  Beginning  in  May  1977,  a  comprehensive 
joint  service  validation  study  was  initiated.  Data, 
including  aptitudinal  and  biographical  information,  were 
collected  on  more  than  70,000  applicants  for  all  military 
services. Predicted attritions based on the equation
derived from the 1972 Air Force sample will first be compared
against actual service attritions to evaluate the
accuracy  of  the  model.  Refinement  of  that  equation,  as 
well  as  development  of  ones  specific  for  each  service, 
will  also  be  accomplished  as  the  attrition  data  further 
matures. 

If the MAP model works as well operationally as
it did experimentally with the 1972 Air Force recruits,
all services can benefit. The higher retention caused
by selecting recruits more likely to complete their
first enlistment would not only save training costs
but would also enhance recruiting.

Literacy  and  the  Measurement  of  Reading 

Recently,  the  General  Accounting  Office  (GAO)  has 
submitted  a  report  on  illiteracy  in  the  military  services 
to  the  Secretary  of  Defense  and  recommended,  among  other 
things, that DOD have the services establish an overall
minimum reading level required for enlistment and determine
the reading grade level required for each military
occupation. In addition, the Congress has become concerned
about the problem of the services' attempting to
correct educational deficiencies of enlistees after they
enter active duty. Congress has suggested that perhaps
a more efficient approach would be for potential enlistees
with educational weaknesses to receive basic skills
training prior to enlistment. Accordingly, the Secretaries
of Health, Education and Welfare and of Labor, in
coordination with the Secretary of Defense, have been
requested to develop such a basic skills program. Of
course, one aspect of this type of education would be
remedial reading.

Clearly, the next step in attacking the literacy
problem is to include reading comprehension as one of
our criteria for enlistment eligibility. Currently,
we have no direct measure of reading ability--only an
approximation derived from our aptitude battery. As a
result, individuals with literacy problems are not
identified until after they experience academic or job
performance difficulties. Obviously, both the Congress
and  GAO  believe  something  should  be  done  to  alleviate 
this  service-wide  situation. 

As a first step, we are developing our own reading
test; if it proves valid and equitable for all groups,
we plan to consider its use as a screening device to
select out individuals who have inadequate reading skills.
Moreover, since GAO has recommended such an approach for
DOD implementation, there is the very real possibility
that a reading test may be incorporated as part of ASVAB
at some future date. Then, as we gain experience in
determining the literacy requirements of military
occupations, we could also attempt to match the reading
skills of personnel (as measured by the reading test)
to the reading demands of our jobs.

This discussion of testing initiatives within the
Air Force is by no means a full and comprehensive one.
Other efforts which merit attention, such as development
and validation of vocational interest measures,
pre-enlistment aptitude tests, non-verbal aptitude tests,
and perceptual/psychomotor devices, were omitted. However,
my purpose in selecting those mentioned was to
present them as illustrations of current attempts to
improve our testing programs.

When I began this morning, I promised to describe
some of our ongoing testing projects and to see how
well they met the Air Force objectives for future military
tests. So far, I have fulfilled the first part
of that promise; now, let me turn to the second. By
way of brief review, the objectives mentioned earlier
indicate that future tests should (a) focus on the
talents and desires of the individual, (b) help maximize
the opportunity for both employer and employee
choice, and (c) as a function of the first two, provide
broader profiles of individuals.

Well, computerized adaptive testing and literacy
assessment certainly focus on the talents of individuals.
A computerized test will bring additional precision
to the testing situation while the reading test
will assess a previously untapped skill. In addition,
the vocational interest test and the perceptual/psychomotor
devices also support this objective.

Under the second criterion, maximizing employee/
employer choice, it seems that all the previously discussed
testing techniques qualify. Improved motivational
attrition prediction will allow us to screen out more
of those individuals who can't make a successful adjustment
to service life. In addition, reading assessment
will permit individuals to enter occupations for which
they possess adequate literacy skills. Then too, vocational
interest measurement will give examinees the
opportunity to discover those jobs with which they will
most likely be satisfied. So, it seems that both the
individual and the Air Force will be happier with the
addition of these types of tests to the personnel test
inventory.

Finally, by virtue of satisfying the first two
objectives, the third one also seems fulfilled. Certainly,
more and more information about individuals
will be added into the personnel selection and placement
process. Thus, when evaluated against our standards
for future tests, it would appear that we are
moving in the right direction.

In conclusion, I hope I have conveyed my enthusiasm
for the future of military personnel testing.
Today, we in personnel management are facing problems
we have never seen before. We have new kinds of people
with differing education levels, skills, values, ambitions,
and life styles that we must consider and make
part of the military family. To do this requires
constantly pushing the testing state-of-the-art. We
must move away from the aptitude measurement-only type
of testing and toward a broader assessment of other
relevant dimensions of human behavior. Obviously, to
perfect such new testing techniques won't be easy. It's
good to be present at this conference and to know there
are the kinds of people represented here who are dedicated
to solving this problem.

One  final  comment  now,  if  I  may.  As  you  conduct 
your  deliberations  this  week,  I  hope  you  will  reflect 
on  and  take  pride  in  what  the  past  has  accomplished. 
However,  the  challenge  I  would  impart  to  you  this 
morning  is  to  think  ahead  and  take  all  possible  actions 
to  maximize  the  contributions  of  military  testing  to 
the  personnel  management  of  the  future. 

Thank you very much.

THE DEVELOPMENT OF A PERFORMANCE APPRAISAL SYSTEM
FOR THE U.S. COAST GUARD*

Edwin  T.  Cornelius  III 
Milton  D.  Hakel 
The  Ohio  State  University 

Joseph J. Cowan
Headquarters, United States Coast Guard


What I would like to do in the next forty-five minutes is share
with you some of the unique aspects of a year-long study to develop an
improved performance evaluation system for enlisted personnel in the
United States Coast Guard. Since the project involved developing an
improved performance evaluation system, I would like to take some time
at the beginning of the talk to briefly describe the existing enlisted
performance evaluation system in the Coast Guard.

All enlisted personnel are rated twice a year. Regardless of type
of job rating or level of responsibility (rank), all personnel are
evaluated on the same rating form. Three attributes of individuals
are evaluated: Proficiency, Leadership, and Conduct. As Figure 1
illustrates, the existing system suffers from the usual performance
rating problems inherent in a large bureaucracy. First, there is an
overall inflation of marks in the system. The average rating score
using the enlisted form is supposed to be 3.3, i.e., if an individual
is performing in a capable and dependable fashion, and is the type of
person the Coast Guard will promote on schedule, he is supposed to be
rated 3.3. As Figure 1 illustrates, the average evaluation is far
higher than 3.3.

A second conclusion obtained from Figure 1 is the dramatic grade
effects. That is, E7's on the average are rated higher than E6's,
and E6's in turn are rated higher than E5's, etc. This is true in
spite of specific directions in the existing system for raters to
evaluate individuals in comparison with others with the same rank and
length of service. In theory at least, the average performing E9
should be rated with the same value (3.3) as the average performing
E3.


A third observation is the redundancy of information in the current
system. As you can see from Figure 1 the patterns of scores for Leadership
and Proficiency are identical. In fact, the Pearson correlation
for these data is r = .90. Of course this means that if you know an
individual's score on the Leadership variable, you can almost perfectly
predict his score for Proficiency and vice-versa. Since there is no
variance in the Conduct scores, essentially one piece of information
about an individual is captured and communicated using the present
system.
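
For readers who want to see what a correlation of this size implies, the
short sketch below computes a Pearson r from paired marks and squares it to
get the proportion of shared variance. The marks in the demo are made up for
illustration and are not the Coast Guard data; only the arithmetic fact that
.90 squared is .81 comes from the figure quoted above.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical Leadership and Proficiency marks for eight people.
    leadership = [3.4, 3.6, 3.7, 3.5, 3.8, 3.9, 3.6, 3.7]
    proficiency = [3.5, 3.6, 3.8, 3.5, 3.8, 3.9, 3.7, 3.6]
    r = pearson_r(leadership, proficiency)
    print("r =", round(r, 2), " shared variance =", round(r * r, 2))
    # For the correlation reported in the talk:
    print("r = .90 implies shared variance of", round(0.90 ** 2, 2))
```

A shared variance of 81 percent leaves the second mark with little
information that the first does not already carry.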

Table 1 illustrates the effect that the operating characteristics
of the present system have had on administrative uses for data of this
sort. One use of performance appraisal data in the Coast Guard is as
an aid in making promotion decisions. Components of the promotion
system include scores on a paper and pencil exam, supervisory ratings
obtained through the current performance evaluation system, length of
time in the service, time in grade, and medals and awards. As you can
see by looking at this table, the inflation of marks, coupled
with the large between-grade variance relative to the within-grade
variance, has reduced the contribution of performance
evaluation data by 50 percent. Promotions in the
Coast Guard today are chiefly determined by scores on the paper and
pencil tests and length of time in service. The contribution of actual
performance on the job as evaluated by supervisors is minimal.

This is the backdrop against which our project started one year
ago. A lot has transpired in that year, and I obviously can't give
you all the details of the project in the time allotted to me this
morning. Instead, I would like to share with you some of the aspects
of this project that we find different and exciting. These are
characteristics of the project that we think represent something
unusual either philosophically or methodologically for studies of
this sort.

1. Philosophy of the Project

We have had the philosophy from the very beginning that the development
of a technically perfect rating instrument by itself would not
lead to good performance appraisal data for the Coast Guard. Even
if you develop a technically sound instrument based on
careful job analysis, and even if it has high user
acceptability and is administratively simple to operate, you can still
fail at this business. You can fail because what counts most in
obtaining good performance appraisal information has very little to
do directly with the format and psychometric characteristics of the
rating instrument. What counts most is the motivation of the raters
in the system to rate accurately. In this regard, an important task
during the last year has been to develop a rater feedback system that
can be used to build trust in the operating characteristics of the
marking system.

During the course of this project we spent many hours discussing
performance appraisal problems with raters while conducting technical
conferences at field locations. A common sentiment expressed by
participants in all these meetings was the sincere desire to rate
subordinates fairly and accurately. The officers and Chief Petty
Officers that we talked to seemed to understand the need for accurate
performance data for manpower planning and development purposes. The
reason these raters rate leniently has nothing to do with the dynamics
of having to face their men on a day in and day out basis, or any of
the several other explanations for lenient ratings proposed by Bass,
Glickman, Kipnis and others through the years. The reason these
raters rate leniently is that they don't trust the system. "I'm
willing to rate accurately, but I don't trust the other raters in the
system to do the same." In this regard I discovered at one conference
an elaborate informal system used to compare marks in an effort to
provide informal guidelines for determining the degree of inflation to
be used in evaluating subordinates.

There is no reason that an approved formal system can't provide
raters with data regarding the distributions of evaluation marks in
the system. All the raters we talked to agreed that if they were
told how other raters in the Coast Guard were rating, and if the
others were using the system properly, they would no longer rate
leniently. There was almost unanimous support for a rater feedback
system.

Regardless of the type of forms that were developed, then, a
requirement for this project was to develop a feedback system to
major commands and to individual raters in the field in order to
maintain trust and openness in the performance appraisal system.

2. Job Analysis Approach

The second aspect of this project that I would like to talk
about is the approach that we used to study the enlisted jobs in the
Coast Guard. When we started this project one of our most difficult
problems was to determine the number of appraisal forms that should
be developed. We were posed with the problem of having almost 30
different job ratings and 9 different ranks to study. A major
activity was to determine how the different jobs and ranks could be
collapsed into major groupings for which separate appraisal instruments
could be developed. Once this decision was made, our task was to
develop prototype forms that were user-acceptable, technically excellent,
and easy to administer. In addition, the whole system had to be
printed on the front and back of a single page. And, oh yes, all this
was to be accomplished within a 10-month framework.

You can see that an immediate problem was how to quickly collect
job analysis information from a variety of different jobs. The usual
philosophy for studying jobs in the military has been the task-oriented
philosophy advocated through the years by Morsh, Christal, Driskill
and others in the Air Force. However, task statements were only
available on roughly half the enlisted jobs in the Coast Guard. We,
therefore, decided to adopt a different type of job analysis philosophy:
the worker-oriented philosophy advocated by McCormick and his associates
over the years at Purdue University.

As you know, the worker-oriented approach seeks to describe jobs
in terms of a limited number of universal job elements that focus on
the generalized human behaviors required for work rather than work
activities specific to an individual job. We thought this particular
philosophy was well suited to answer the question of how to quickly
compare the large numbers of jobs and ranks in the Coast Guard. Our
decision, then, was to develop a single worker-oriented questionnaire
specifically for the Coast Guard that could be mailed to representatives
of all the different job ratings and ranks of enlisted personnel.

As a starting point in this endeavor we borrowed heavily from the
most famous (and only) worker-oriented questionnaire: the Position
Analysis Questionnaire (PAQ), developed by McCormick under another
government contract. We made several changes in the PAQ to adapt it
for Coast Guard use. First, we deleted items that were not at all
appropriate to the military setting. Of those items that we did keep,
we changed the wording and the examples to fit more clearly the Coast
Guard setting. A major effort involved reducing the reading level of
the PAQ. Previous research with the PAQ had shown that it required a
post-college-graduate reading level. This is fine for trained job
analysts, but would not do in a mass mail-out to enlisted personnel
in the Coast Guard, where the average education level was at the 12th
grade level or below. In this regard we were successful in reducing
the reading level from the 17th grade level to the 10th grade level,
as measured by a computer program for that purpose that we have at
Ohio State.
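
The talk does not say which readability formula the Ohio State program
used. Purely as an illustration, the sketch below computes the
Flesch-Kincaid grade level, a common formula that is assumed here and
is not the authors' stated method. The syllable counter is a crude
vowel-group heuristic, and both sample sentences are invented.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, discount a final 'e'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level (an assumed stand-in formula)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Invented item wordings: a plain one and a jargon-heavy one.
plain = "The crew checks the boat. The crew cleans the deck."
dense = ("Incumbents periodically coordinate organizational "
         "communication responsibilities.")

print(flesch_kincaid_grade(plain), flesch_kincaid_grade(dense))
```

Rewording an item with shorter sentences and shorter words lowers the
computed grade, which is the kind of before-and-after check the
reading-level reduction implies.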

Another significant change involved eliminating the variety of
different response scale formats used on the PAQ. We converted all
items so that they could be evaluated by the Relative Time Spent scale
used in the Air Force and other military services. A final revision
was to add some fifty items that we called "leadership process items."
In our study of the PAQ we found that as a job analysis instrument it
was particularly deficient in the area of differentiating among higher
level leadership processes. Our source for these additional
supervisory-type items was verb lists from previous task analyses
performed in the Coast Guard.

We finally ended up with a 12-page, 153-element booklet that was
mailed to some 3000 Coast Guard enlisted personnel. All job ratings
and grades were equally represented in the sample. Incidentally,
we got a 64 percent return rate at our cutoff point. Our final
sample consisted of responses from 2,023 individuals.

3. Statistical Methodology for Analyzing the Job Analysis Data

Let me now describe the data base that the returned questionnaires
provided us. First of all, the data formed a three-dimensional cube.
One facet of the cube had 153 levels representing the 153 different
worker-oriented job elements on our questionnaire. A second facet of
the cube contained 23 levels representing the 23 job ratings in the
Coast Guard for which sufficient numbers of persons existed to be
included in the statistical analyses. The last facet of the cube
contained 9 levels representing the nine different ranks in the Coast
Guard. Each cell of this three-dimensional matrix represented a
unique combination of job rating, rank, and job element, and contained
from 15 to 35 observations, depending upon the return rate for that cell.

Mean relative time spent values were computed across all observations
in each cell to produce a final 153 x 23 x 9 data matrix containing
mean values.
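
As a minimal sketch of this averaging step, the code below groups
individual responses by cell (job element, job rating, rank) and takes
the mean of each group. The record layout, the rating codes, and the
values are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def cell_means(responses):
    """Average relative-time-spent values within each cell.

    `responses` is an iterable of (element, rating, rank, time_spent)
    tuples; this record layout is an assumption for illustration.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for element, rating, rank, time_spent in responses:
        key = (element, rating, rank)
        totals[key] += time_spent
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical mini data set: two respondents share the first cell.
responses = [
    (0, "BM", "E4", 3.0),
    (0, "BM", "E4", 5.0),
    (0, "RM", "E7", 2.0),
    (1, "BM", "E4", 1.0),
]
means = cell_means(responses)
print(means[(0, "BM", "E4")])
```

With the full questionnaire data, the resulting dictionary would hold
one mean for every filled cell of the 153 x 23 x 9 cube.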

The major data analysis question facing us was how to analyze
simultaneously all facets of this cube and come up with practical
suggestions for the number of forms that should be developed for Coast
Guard use. In this regard, Tucker's three-mode factor analysis is
uniquely designed to analyze data of this sort. Three-mode factor
analysis proceeds in two stages. During the first stage a separate
factor analysis is computed on each of the separate modes of the data
(in our case: job elements, job ratings, and ranks). In the second
stage a core matrix is created that interrelates factors from the
various modes of the data.
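
The two stages can be sketched with NumPy as follows. This is an
illustrative reading of a Tucker-style decomposition, not the authors'
program: each mode's factors are taken as the leading left singular
vectors of the array unfolded along that mode, and the core matrix is
the data array projected onto those factors. No rotation is applied.
The data are random placeholders, and the choice of four job-element
factors is an assumption (the five rating factors and two grade factors
follow the counts reported in this talk).

```python
import numpy as np


def mode_factors(X, mode, n_factors):
    """Stage 1: component analysis of one mode of a three-way array."""
    # Unfold so that the chosen mode indexes the rows.
    unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
    # Left singular vectors are the eigenvectors of unfolded @ unfolded.T.
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return U[:, :n_factors]


def three_mode(X, n_factors):
    """Return factor matrices for the three modes and the core matrix."""
    A, B, C = (mode_factors(X, m, k) for m, k in enumerate(n_factors))
    # Stage 2: the core relates the factors of the three modes.
    core = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
    return A, B, C, core


rng = np.random.default_rng(0)
X = rng.random((153, 23, 9))      # placeholder for the matrix of means
A, B, C, core = three_mode(X, (4, 5, 2))
print(core.shape)
```

Each core entry indexes one combination of an element factor, a rating
factor, and a grade factor, which is what makes it possible to read off
which rating-by-grade groups behave alike.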

To illustrate the kinds of meaningful output that this procedure
gave us, I have included the results of the job grade factor analysis
in Table 2. As you can see, there were two factors that were extracted
and rotated. E4's and E5's had principal loadings on Factor I; E7's,
E8's, and E9's had principal loadings on Factor II; and E6's had
loadings on both Factor I and Factor II. In general, this told us
that in terms of relative time spent on these worker-oriented items,
Chief Petty Officers had roughly the same pattern of responses.
Likewise, E4's and E5's could be characterized as similar. E6's,
however, were found to be similar to both groups. That is, some of
the processes E6's have to exhibit on the job are similar to those of
Petty Officers and some are similar to those of Chief Petty Officers.
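
The reading of Table 2 described above can be reproduced with a simple
rule. The loadings below are the Table 2 values; the cutoff of .25 for
calling a loading "principal" is an assumption chosen for illustration
and is not a criterion stated in the talk.

```python
# Varimax-rotated loadings from Table 2: grade -> (Vector I, Vector II).
LOADINGS = {
    "E-4": (0.74, -0.11),
    "E-5": (0.59, 0.06),
    "E-6": (0.30, 0.35),
    "E-7": (0.01, 0.64),
    "E8/E9": (-0.09, 0.67),
}

THRESHOLD = 0.25  # assumed cutoff, not taken from the source


def salient_factors(loadings, threshold=THRESHOLD):
    """List the factors on which each grade loads at or above the cutoff."""
    names = ("I", "II")
    return {
        grade: [name for name, value in zip(names, values)
                if abs(value) >= threshold]
        for grade, values in loadings.items()
    }


groups = salient_factors(LOADINGS)
for grade, factors in groups.items():
    print(grade, factors)
```

Under this cutoff E-4 and E-5 fall on Factor I only, E-7 and E8/E9 on
Factor II only, and E-6 on both, matching the interpretation given in
the text.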

The analysis of the remaining two modes produced equally interpretable
results. For example, the factor analysis of the job rating mode indicated
that there were five factors: one representing the various electronics
ratings, one representing the aviation ratings, one representing
the deck/watch ratings (e.g., boatswain's mate), one representing
the engineering ratings, and a final factor made up of service-type
ratings such as musician, photojournalist, and hospital corpsman.

An inspection of the core matrix provided the final suggestion
as to which groups of job ratings and ranks could be combined. From
the entries in the core matrix we concluded that regardless of which
job rating they came from, the responses for Chief Petty Officers
were similar enough that they formed a group by themselves. Likewise,
we found that there were five identifiable groups of Petty Officers,
one for each of the five factors that I described to you.

These results led us to recommend to the Coast Guard that they
implement a performance appraisal system that contains seven forms:
one form for Chief Petty Officers, five forms for the different types
of Petty Officers, and one form for the non-rated personnel (Seamen,
Airmen, and Firemen in the E1 - E3 ranks). We feel that a system of
seven forms would be maximally sensitive to the different types of
work and levels of responsibility inherent in the enlisted personnel
population of the Coast Guard.

We took these suggestions with us to the technical conferences
in the field and essentially received support from raters for a system
of this sort. The whole process still amazes me! We started out
with worker-oriented job elements, analyzed them with a very complex
multivariate statistical technique, and ended up with suggestions that
were practical and acceptable to people in the field.

4. Emphasis on User Acceptance

Another important aspect of this project has been our heavy
emphasis on developing a system with high user credibility and
acceptability. Last spring we held several formal technical conferences
with representatives from six different groups for which these forms
were being developed. These meetings lasted one day each and were
held at Governor's Island in New York harbor and at Elizabeth City,
North Carolina. These sessions were characterized by group exercises,
structured questionnaires, and open-ended discussions.

As an illustration of how we used suggestions from the field to
shape the final format of these rating instruments, I would like to
show you the response scale that we have included in the final
proposed versions of the evaluation instruments (see Table 3). As
you can see, we decided on a response scale with five categories.

As an aside, you are probably aware of the controversy in the
literature regarding the number of discriminations that humans can
reliably make when rating the performance of others. In terms of
the optimum number of scale points, the most often cited recommendation
is seven, probably based on Miller's 1956 paper on information processing
capacities of humans. However, some researchers have argued that the
more points the more reliable the data, and others have argued that the
more scale points the less reliable the data. In reviewing the
literature in this area, we were persuaded by a Monte Carlo study by
Lissitz and Green at the University of Georgia. These researchers
demonstrated that, in general, the more points on the response scale
the more reliable the data. However, the leveling of the curve beyond
five points was such that no practical increase in reliability can be
achieved beyond five points on a scale.

Therefore, when we went to the field sessions last spring our
bias was five points. However, our mock-up for these sessions contained
a response format with eight response categories. To our surprise, we
found an overwhelming preference for a rating scale with four or five
points rather than one with a larger number of points as in our mock-up.
All raters in the field said that they could confidently identify the
extreme outliers (outstanding performers and unsatisfactory performers).
In addition, most raters said that of those that were left they could
probably reliably make distinctions among three groups of personnel,
roughly corresponding to average, above average, and below average
performance.

We tried to get some consensus about how to label these five
categories, but found it difficult. Everyone agreed to the labels
"outstanding" and "unsatisfactory" for the two extreme rating categories
on either end of the scale. However, there was no strong indication of
how to label the three categories in the middle of the scale in a way
that would convey the same meaning to all raters in the field. It was
finally suggested that regardless of what the middle three categories
were labeled, the most helpful information would be an indication of
the suggested distribution of ratees that should fall in these middle
categories. As you can see in Table 3, all three boxes are simply
labeled "good," and the values 10%-70%-10% have been included on the
final format. Raters in the field felt that the values 10-70-10 would
interpret and give meaning to what was meant by the three levels of
"good" performance. These values (10-70-10), incidentally, reflect to a
great extent what the raters in the field felt the actual distribution
of talent in the Coast Guard was. That is, most raters believed that
the overwhelming majority of enlisted personnel in the Coast Guard were
doing a good, capable job. In addition, these raters felt that a much
smaller percentage of Coast Guard personnel performed a little better
or a little worse than this majority. And, finally, an extremely
small percentage (5%) were either outstanding or unsatisfactory
performers.
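
To make the suggested distribution concrete, the sketch below converts
the percentages shown on the form (5-10-70-10-5, per Table 3) into
expected head counts and compares them with one rater's actual marks.
The names of the three "good" boxes and the lenient set of marks are
hypothetical; the form itself labels all three middle boxes "good."

```python
# Suggested percentages per response category, taken from Table 3.
SUGGESTED_PERCENT = {
    "unsatisfactory": 5,
    "good (lower box)": 10,
    "good (middle box)": 70,
    "good (upper box)": 10,
    "outstanding": 5,
}


def expected_counts(n_ratees):
    """Head count per category if marks followed the suggested split."""
    return {cat: n_ratees * pct / 100
            for cat, pct in SUGGESTED_PERCENT.items()}


def percent_point_gap(observed_counts):
    """Observed percentage minus suggested percentage, per category."""
    total = sum(observed_counts.values())
    return {cat: 100 * observed_counts.get(cat, 0) / total - pct
            for cat, pct in SUGGESTED_PERCENT.items()}


# Hypothetical lenient marks for 200 ratees.
observed = {
    "unsatisfactory": 0,
    "good (lower box)": 5,
    "good (middle box)": 95,
    "good (upper box)": 60,
    "outstanding": 40,
}
gaps = percent_point_gap(observed)
print(expected_counts(200))
print(gaps)
```

Positive gaps at the upper end of the scale are the signature of
leniency that the printed percentages are meant to discourage.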
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Decisions  about  other  characteristics  of  the  final  evaluation 
forms were also made during these field conferences. The final
seven forms in the system, for example, each have two major blocks
of rating items: those measuring personal qualities (e.g.,
Dependability, Initiative) and those measuring performance of duties.

The list of personal qualities to be rated is constant across all
forms. That is, we felt that the personal attributes important for
success as a Petty Officer Boatswain's Mate are the same as the personal
attributes necessary for a Petty Officer Radioman. However, the
performance of duties items were for the most part unique to each form.
These items were selected from the three-mode factor analysis output and
tended to be items that had high relative time spent ratings and at
the same time were important in differentiating among the groups.

The  number  and  definitions  of  the  personal  qualities  changed 
somewhat  as  a  result  of  the  field  conferences.  For  example,  the 
raters  told  us  that  they  would  find  it  difficult  to  differentiate 
between  the  traits  Motivation  and  Initiative.  Therefore,  on  the 
final  form  we  combined  the  two  into  a  single  definition  under  the 
trait  Initiative.  An  invariable  request  from  the  field  settings  was 
to  include  the  trait  "Military  Bearing"  in  the  final  list  of  attributes 
to be rated. Likewise, we discovered that a number of the performance
of duties items taken from the job inventory with good statistical
properties were not particularly meaningful to raters in the field
and were, therefore, deleted.

One thing that I think was unique about our approach is that we
took these suggestions seriously. I know that there is a lot in the
advice-giving literature in performance appraisal to suggest that
you should avoid using personal traits on evaluation forms, particularly
traits such as Military Bearing. However, one of our prime interests
was in the attitudes of raters in the field. We felt that before you
can expect raters to give you good data you must have an instrument
that has credibility and is acceptable and meaningful to the people
who are going to use it. (Moreover, on a more technical level, the
suggestion by Kavanaugh that the evidence is not yet in on the
superiority of behavior ratings versus trait ratings is due some
consideration.)

5. Experimental Test Period

The final aspect of this project that I think is important to
relate to you is the commitment on the part of the Coast Guard to an
experimental tryout of the proposed forms before they are implemented
system-wide. The purpose of the field tryout will be to investigate
the psychometric characteristics of, and the psychological reactions
to, the proposed system under "live" conditions. We will try to find out
whether or not the new system with seven forms can improve upon the
existing system that I described at the beginning of this talk. As
part of the current project, then, we have proposed an experimental
design whereby the various aspects of the proposed system can be
tested and evaluated.

Conclusions. Before I leave you this morning I'd like to comment
on how we think this new appraisal system will solve some of the ills
of the existing system that I outlined 45 minutes ago. First of all,
we hope that grade effects in the marking system will be reduced
simply by the fact that high ranking NCO's will now be evaluated on
a separate form from lower ranking NCO's. We think this will make it
more palatable for raters in the field to rate an E7 or an E8 as
average, in comparison with other E7's or E8's in the Coast Guard.
Secondly, we think that the tendency of raters to rate leniently will
be reduced on a form that dramatizes the fact that 90 percent of the
personnel should be rated in the middle three blocks on the form.
If the feedback system works and raters begin rating average
performance in the average block, we should get the kind of discrimination
in the system that is needed in order to identify only the outstanding
candidates for promotion. With respect to redundancy of information,
we don't kid ourselves about the fact that we will get high correlations
among the many different items on the form. However, by breaking down
complex concepts such as Leadership and Proficiency we offer the rater
the possibility to rate differentially within a single ratee, rather
than to rate on the basis of an overall global evaluation. All these
ideas, of course, will be tested in the experimental phase.

In summary, we have developed a promotion system for enlisted
personnel that contains a rater feedback and monitoring component
for a collection of seven separate evaluation forms. The unique
aspects of this project that we have talked about this morning are
1) the philosophy regarding the most important determinant (rater
motivation) of effective performance appraisal data, 2) the
worker-oriented inventory that we developed, 3) the statistical
methodology that we used to simultaneously analyze the three modes of
the worker-oriented job analysis data, 4) the emphasis we have had on
the importance of user acceptance, and 5) the field experimental tryout
of the proposed system.


Table 1

Actual Contributions of Factors for Advancement
in Rate During March 1976 E3-E8 Servicewide
Competition for Advancement*

                               Percent Contribution
Factor                         Intended      Actual

Examination Score                 44           40
Performance Evaluations           28           15
Time in Service                   11           38
Time in Paygrade                  11            6
Medals and Awards                  6            1

*Taken from J. F. Stumpff and R. D. Chevalier, An Analysis
and Proposed Revision of the Coast Guard Enlisted
Performance Evaluation System. Thesis submitted to the
Naval Post Graduate School, Monterey, California, December,
1976.

Table 2

Varimax Rotated Eigenvectors for the
Two-Dimensional Approximation of
the Grade Mode Variance

                  Vectors
Grade          I         II

E-4           .74       -.11
E-5           .59        .06
E-6           .30        .35
E-7           .01        .64
E8/E9        -.09        .67


Table 3

Response Categories on the Final Rating Form

RESPONSE CATEGORIES

Not          Unsatis-                                   Out-
Observed     factory               Good                 standing

  [ ]          [ ]          [ ]     [ ]     [ ]           [ ]
               5%           10%     70%     10%           5%


Taken from J. F. Stumpff and R. D. Chevalier, An Analysis and Proposed
Revision of the Coast Guard Enlisted Performance Evaluation System. Thesis
submitted to the Naval Post Graduate School, Monterey, California, December, 1976.


Notes on the Feasibility of Predicting
Fighter Pilot Effectiveness

ABSTRACT

History  has  demonstrated  that  there  is  a  pressing  need  for  improved 
selection  and  training  of  fighter  pilots.  In  World  War  II,  only  one  of 
twenty  pilots  became  an  ace.  The  U.S.  Air  Force  kill  ratio  in  Southeast 
Asia  was  approximately  2.5  to  1.  In  contrast,  the  Israelis  claim  to  have 
a kill ratio of 60 to 1.

Our feasibility study has focused upon enhancing our record of
air-to-air combat kills through more stringent and comprehensive
selection procedures. Evidence is presented which demonstrates that a
program can be developed to select pilots who will be effective in
air-to-air combat.

From  reviews  of  U.S.  and  foreign  selection  research  dating  from 
World War II to the present and an assessment of pilot opinion from
hundreds  of  aces,  45  factors  were  identified  as  potential  predictors  of 
fighter  pilot  combat  effectiveness.  Of  these  45  factors,  only  10  are 
adequately  evaluated  within  current  military  selection  programs  upon 
entrance  into  pilot  training.  Assessment  of  the  remaining  35  untapped 
factors  is  within  our  technological  reach.  In  fact,  many  of  these 
factors  can  be  assessed  by  tests  which  are  presently  available. 

We  developed  an  Air  Combat  Effectiveness  Study  (ACES)  program  which 
would establish selection test measures for virtually all of the factors
identified  as  underlying  fighter  pilot  combat  effectiveness.  As  part  of 
the  ACES  program,  selection  test  measures  would  be  validated  against 
performance  in  air  combat  maneuvering  ranges,  thereby  providing  a  method 
for  selecting  fighter  pilots  during  peacetime.  We  have  emphasized 
selection  for  success  in  the  operational  environment  rather  than  success 
in  training. 

Armed  with  these  selection  test  scores  and  an  effectively  executed 
validation  program,  researchers  should,  for  the  first  time  in  history, 
be  able  to  specify  a  definitive  profile  of  the  ace  fighter  pilot. 


1 This paper is based upon ARPA (Defense Advanced Research Projects
Agency) Contract No. HDA-9-3-76-C-0169, "Feasibility Study to Predict
Combat Effectiveness for Selected Military Roles: Fighter Pilot
Effectiveness" by E. W. Youngling, S. H. Levine, J. B. Mocharnuk, and
L. M. Weston, dated 29 April 1977, MDC Report E1634.

Notes on the Feasibility of Predicting
Fighter Pilot Effectiveness


Fighter  pilot  combat  effectiveness  was  selected  for  investigation 
for several reasons. Top level DoD concern with combat effectiveness
is  always  present  and  the  Defense  Advanced  Research  Projects  Agency  was 
interested  in  the  importance  of  manning  high  cost  weapons  systems.  In 
those  systems,  the  cost  of  developing  and  implementing  proper  selection 
and  training  programs  should  be  quite  small  relative  to  the  total  weapons 
system  cost.  Because  the  McDonnell  Douglas  Corporation  is  intimately 
concerned  with  fighter  aircraft  systems  effectiveness,  the  fighter  pilot 
role  was  of  special  interest  to  us.  Also,  recently  developed  combat 
maneuvering  ranges  were  considered,  with  simulators,  as  potential  tools 
for use in selection as well as training. Complementing these interests,
the  history  of  air  combat  has  demonstrated  that  a  need  for  improved  pilot 
selection  and  training  exists. 

In World War II we sent many pilots to war; some survived their
early combat engagements, became skilled at their craft, and went on to
become aces - but they were the exception. According to official Eighth
Air Force records of approximately 5000 fighter pilots who flew against
the Germans during 1943 - 1945, relatively few became aces. Only 261
(about 5.2 percent) achieved this goal. However, this small group of
men was responsible for 40 percent of the total 5284.5 German planes
destroyed by the Eighth Air Force fighter pilots during that period;
thus, 5 percent claimed 40 percent of the kills (Eighth Air Force, 1945).
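
The size of this concentration can be worked out from the figures in
the paragraph above. The sketch below is an arithmetic illustration
only: the pilot total of roughly 5000 is an inference from 261 aces
being about 5.2 percent, and the variable names are ours.

```python
pilots = 5000              # approximate head count (assumed; see lead-in)
aces = 261                 # pilots who became aces
planes_destroyed = 5284.5  # total credited to Eighth Air Force fighters
ace_share = 0.40           # share of kills credited to the aces

ace_fraction = aces / pilots
ace_kills = ace_share * planes_destroyed
kills_per_ace = ace_kills / aces
kills_per_other = (planes_destroyed - ace_kills) / (pilots - aces)
ratio = kills_per_ace / kills_per_other

print(f"aces as share of pilots: {100 * ace_fraction:.1f}%")
print(f"kills per ace:           {kills_per_ace:.1f}")
print(f"kills per other pilot:   {kills_per_other:.2f}")
print(f"ratio:                   {ratio:.0f} to 1")
```

On these assumptions the average ace accounted for about eight aircraft
while the average non-ace accounted for fewer than one, a difference of
roughly twelve to one.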

In the Korean War, once again it was found that a small percentage
of Air Force pilots were obtaining most of the kills. Here again, the
largest group of pilots recorded no kills (53.5 percent) while a small
group of 38 pilots (4.8 percent) became aces. Importantly, each of the
F-86 pilots had at least 25 counterair missions and, therefore,
presumably, a fair chance for a kill (Torrance, Rush, Kohn, and Doughty, 1957).
Clearly, the aces, a small group, make an overwhelming contribution to
air-to-air kill records and air supremacy. We must find a way to augment
our record of air-to-air kills, especially since we will probably be able
to field only a relatively small number of fighter pilots in future wars.

Fighter  pilots  have  recommended  that  the  way  to  improve  air-to-air 
performance  is  to  select  a  man  according  to  more  rigid  standards,  give 
him  specialized  training,  and  keep  him  in  the  cockpit.  We  feel  that  it 
is  prudent  to  seriously  consider  the  recommendations  of  the  fighter 
pilot  community. 


FIGURE 1 AIR-TO-AIR COMBAT EFFECTIVENESS FEASIBILITY STUDY


Much of the present study is concerned with the issue of finding a
better way to select men for the fighter pilot role. We have emphasized
air-to-air combat. The development of a selection program of this nature
requires several discrete operations: a job analysis of the fighter
pilot task, the generation of testable trait hypotheses, the development
of predictor variables, validation, and cross validation. In this study,
an extensive and comprehensive job analysis was performed, hypotheses
were established, and programs were outlined for execution of the
remaining operations. We conclude that a program can be instituted which
could select men at entry into the military who would prove to be effective
air-to-air pilots. For validation purposes, these pilots can demonstrate
their effectiveness, during peacetime, in air combat maneuvering range
engagements. High quality intermediate criteria can be developed in this
type of facility. Furthermore, we believe it is possible to select
pilots who have the motivational characteristics of those who will fight
effectively in actual combat.

Figure 1 contains a portrayal of the classes of information evaluated,
the resultant profile of the combat effective pilot, and the ACES program.
The principal work elements of this study focused on the identification of
those critical characteristics and skills which are thought to
characterize the combat effective pilot. In all, five major sources of
information were used to generate the integrated profile of the combat
effective pilot shown in Figure 1. This profile is based on our
comprehensive and systematic review of World War II, Korean, and Southeast
Asian conflict information and deals with the scientific studies of combat
aviation and combat infantry for those wars. The U.S. Military Aviation
Selection research from World War II to the present time was reviewed.
This literature, which focuses upon characteristics and critical skills
which predict success in flight training, has been used as a source of
hypothesis generation because, by implication, these factors should be
related to combat effectiveness. We also reviewed the German and Japanese
World War II aviation selection research programs as well as the current
Israeli program. Our final source of information came from 373
questionnaires which were returned to us by fighter pilots. Significantly,
280 of these returns were either from ace aviators or aviators with MIG
kills in Southeast Asia. The fighter pilot organizations surveyed and
their response rates are reflected in Figure 2.

Using the data and inputs from the five sources of information, we
generated the integrated profile of the combat effective fighter pilot
which is shown in the accompanying figure. In all, some 45 factors
distributed among 12 major domains (see Figure 1) can be reasonably
hypothesized to be of predictive value in identifying the combat effective
air-to-air fighter pilot. As an example, of the 45 factors which can
legitimately be supposed to underlie air-to-air fighter pilot combat
effectiveness, 35 of them are not adequately tested for by the U.S. Air
Force in their entry selection program. Since the methodology either
exists or is within our technological grasp for testing the bulk of these
hypothesized predictor variables, we believe that a prima facie case has
been made for the overall feasibility of such a research program.

PILOT ORGANIZATION                             NO. SENT   NO. BACK     %

AMERICAN FIGHTER ACES ASSOCIATION                 525        257       49
NAVY MIG KILLERS ASSOCIATION                       42         22       52
RED RIVER VALLEY FIGHTER PILOTS ASSOCIATION        96         51       53
NATIONAL GUARD PILOTS                              40         13       33
ACEVAL - AIMVAL NAVY PILOTS                        12         12      100
ACEVAL - AIMVAL AIR FORCE PILOTS                    9          6       67
AGGRESSOR SQUADRON AT NELLIS                       12         12      100

TOTAL                                             736        373       51

FIGURE 2 PILOT ORGANIZATION AND RESPONSES

In the lower right hand portion of Figure 1 is an outline of the
ACES selection test program which would test for the 33 factors listed
within the figure. The hypothesized predictor variables are grouped
according to the class of testing device which emerged as most
appropriate in our analysis of cost-effectiveness and practicality. We
conceptualize a selection test program as having similarities with the
current Israeli program, although probably more comprehensive. Such a
selection test program could best be conducted at a single site, and
candidates would be tested during an estimated seven to ten day period.

The next step in the ACES program, after the implementation of the
combat effective pilot selection test battery, is the peacetime
validation phase (air combat maneuvering range performance assessment
program). The use of success in pilot training as a criterion for pilot
selection has proven unsatisfactory as a means of identifying the combat
effective pilot. While successfully completing pilot training is a
necessary condition for becoming a pilot, it is not a sufficient
condition for becoming combat effective. A performance criterion having a
stronger relationship to combat success is needed to properly validate
the test battery. Evidence that tests which predict success in training
also predict success in the operational environment has not been found.
Figure 3 shows correlations between three predictors and two criteria,
passing advanced flight training and the number of combat kills. Although
the data are from different studies, upon inspection they suggest that
we, at least, carefully evaluate the utility of using only training
criteria for validating our selection instruments.

FIGURE 3  TRAINING CRITERIA VERSUS OPERATIONAL CRITERIA

Once this phase of the peacetime program is completed, the pilot's
combat effectiveness scores would be correlated with the ACES selection
test program scores, and the determination made concerning the predictive
power of the ACES selection test program. If, upon analysis, adequate
correlations exist between some of the candidate pilot's selection test
scores and his performance during dissimilar air combat testing, then
there will exist, during peacetime, a way of selecting pilots who will
perform adequately in a necessary condition for combat success. Combined
with appropriately small selection ratios, the probability of selecting
combat effective pilots will be greatly enhanced.
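
The correlational step described above can be illustrated with a short
sketch. It assumes one selection test score and one range performance
score per pilot; the numbers are invented for illustration and are not
data from this study, and the function name pearson_r is ours rather than
part of any ACES procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six pilots (illustrative only).
selection_test = [52, 61, 47, 70, 66, 58]
range_performance = [3.1, 3.8, 2.9, 4.4, 3.9, 3.5]

validity = pearson_r(selection_test, range_performance)
print(f"validity coefficient r = {validity:.2f}")
```

In practice one such coefficient would be computed for each test in the
battery, and only tests with adequate coefficients would be retained.
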

Finally, we have prepared an ACES combat contingency validation plan,
for ultimately, should the situation eventuate, one could relate the
pilot's scores in both the ACES selection test program and the ACES pilot
air combat maneuvering range performance assessment program with
performance in actual combat. The combat contingency validation plan
includes a specification of the combat data required to make a
quantitative assessment of the combat effectiveness of individual pilots,
uncontaminated by such issues as opportunity. We have used the
Strawbridge and Kahn (1955) study of combat effectiveness in the air war
in Korea as a model for developing the data requirements for a scientific
and rigorous combat data collection program. Critical combat data
requirements identified by those authors include missions, sightings as
leaders, firings, and weighted kills.

While the overall ACES program is ambitious, it is, in our opinion,
quite feasible and is potentially a very high payoff program. The nature
and size of the effort required to accomplish this job is such that it
will clearly require high level endorsement and sponsorship.

The ACES combat contingency validation program which we have sketched
out here is clearly provisional. However, it does supply a usable
departure point for a more carefully contrived plan. Air Force efforts,
both in World War II and, particularly, the Korean conflict, were well
conceived and executed. Indeed, they form a large basis of what we can
say factually about the factors contributing to air combat effectiveness.
However, the researchers, through no real fault of their own, had only
very limited relevant information on the pilots prior to their entry into
combat.

If the ACES program recommended in this report is implemented,
research scientists will be armed with selection test scores for virtually
all of the dimensions presumed to underlie fighter pilot combat
effectiveness. Armed with these test scores and an effectively executed
combat data acquisition validation program, the military aviation research
community should be able to select those persons who are most likely to be
combat successful fighter pilots.


Initial Development of the Organizational Assessment Package

William H. Hendrix
Air Force Human Resources Laboratory
Brooks Air Force Base, Texas


INTRODUCTION

Organizational effectiveness is an area of vital concern to the
Air Force. It is an area that can result not only in financial loss
but also in loss of human resources. Most Air Force personnel can recall
personal experiences where they have witnessed organizations which were
obviously inefficient financially as well as having morale and
productivity problems.

The problem addressed in this paper is how does one effectively model
organizational effectiveness and, in turn, measure it. Toward that end,
a Three Component Organizational Effectiveness Model is presented and
the data collection instrument package which is based on the model is
described. The instrument package is entitled the Organizational
Assessment Package (OAP).


BACKGROUND 


The Organizational Assessment Package (OAP) is being developed for
use by the Air Force's Leadership and Management Development Center
(LMDC), Maxwell AFB, Alabama. The objectives of LMDC include: (a)
providing consultative services to Air Force commanders, (b) providing
leadership and management training to Air Force personnel in their work
environment, and (c) performing research in support of (a) and (b) above.
The consultative role involves organizational problem area identification
and recommendations for reducing or eliminating problems identified. The
OAP is being designed to meet LMDC's objectives. First, the OAP will
provide a means of identifying existing strengths and weaknesses within
organizations. Second, research results can be fed back into their
Professional Military Education, other leadership and management training
courses, Air Staff, and functional Offices of Primary Responsibility
(OPR's). Lastly, the OAP data base established can be used for research
to strengthen the overall Air Force.

THREE COMPONENT ORGANIZATIONAL EFFECTIVENESS MODEL

The Three Component Organizational Effectiveness Model* (cf. Figure 1)
was primarily reported by Hendrix (1976), and considered Organizational
Effectiveness (E) to be a function of: the criterion selected (c);
the managerial style employed (m); and the situational environment (s),
which includes the manager's subordinates, peers, and other personnel
in the environment. That is: E = f(c,m,s).

ORGANIZATIONAL  ASSESSMENT  PACKAGE 

The Organizational Assessment Package (OAP) is designed to measure
the basic components of the Three Component Organizational Effectiveness
Model. As can be noted in Figure 1, the Supervisory Job Inventory (SJI)
is designed to measure managerial style (m), while the situational
environment (s) is measured by two sections of the OAP, the Background
Information section and the Organizational Job Inventory (OJI). The
criteria are satisfaction, organizational climate, and perceived
productivity. These are measured by the sections entitled: Job
Satisfaction Questionnaire (JSQ), Organizational Climate Inventory (OCI),
and Perceived Productivity Index (PPI). Hard data when available will be
collected separately and merged with the OAP data base.

*In Hendrix (1976) the model was initially entitled the Three Component
Leadership Effectiveness Model and has since been expanded to focus
on the entire organization.

OAP FACTORS

Items within each of the OAP sections have been written to measure
certain factors. The Background Information section contains biographical
information items and items associated with factors in the situational
environment. The factors in the situational environment which the items
attempt to measure include: (a) organizational level of work group, (b)
work group type, (c) work group size, (d) group member maturity, (e)
organization's geographic region, (f) extent to which work group meetings
are used to establish goals, (g) extent of communication between work
group members, and (h) stability of work hours. In addition, the
situational environment is in part measured by the Organizational Job
Inventory (OJI). The factors included in the OJI are based, in the main,
on the job enrichment model proposed by Hackman, Oldham, Janson, and
Purdy (1975). They proposed five basic factors which they called Core Job
Dimensions. Those were: (a) skill variety, (b) task identity, (c) task
significance, (d) autonomy, and (e) feedback from the job. These factors
are to be measured by the OJI plus one additional work related factor
which is labeled Work Interference. This factor deals with the extent and
adequacy of: (a) additional duties, (b) equipment and supplies, and (c)
provided work space.

In the criterion area, organizational climate is measured by the
Organizational Climate Inventory which includes the factors of: (a)
communications, (b) general organizational conditions, (c) employee
concern, (d) employee commitment, (e) decision making, and (f) recognition.

Another criterion area is that of job satisfaction which is measured
by the Job Satisfaction Questionnaire (JSQ). This questionnaire contains
30 items which are descriptions of 30 factors out of 35 factors isolated
by Gould (1975) in an unpublished study. The methodology and items used
to isolate the factors can be found in Tuttle, Gould and Hazel (1975).
The 30 factors are listed in Table 1.

The last criterion is perceived productivity and is measured by I* items
contained within the Perceived Productivity Index section. The items
measure perceived productivity in terms of the work group's: (a) quantity
of work output, (b) quality of work output, (c) performance when high
priority work arises, and (d) whether flow of work to or from the work
group is impaired.

The  Supervisory  Job  Inventory  (SJI)  consists  of  81  items  relating  to 
supervisory  behavior.  Once  an  adequate  sample  has  been  obtained  these  items 
will  be  factor  analyzed  and  the  resulting  factors  will  be  used  to  depict 
differing  managerial  behaviors. 

PROGRESS 

A small scale study (n * lMi) was conducted at Lackland Air Force Base
during May 1977. One purpose of the study was to collect critique
information on the OAP in order to improve it. In addition, the data
served to provide an initial base line in terms of means and standard
deviations for each item on the OAP. An intercorrelation matrix,
consisting of the OAP item variables plus a series of compound variables
generated from the original variables, was used to: (a) delete items
which did not intercorrelate well with the stated factors, and (b)
establish simple correlational relationships between variables in the
situational environment and managerial area with criteria items. The OAP
previously described is the result of revisions based on data collected
from the Lackland study. The major modification was the deletion of the
Job Diagnostic Survey (JDS) (Hackman et al., 1975) from the instrument
package, with the OJI being used instead to establish the job enrichment
variable values. The reason for deleting the JDS instead of the OJI was
to reduce the total pages in the OAP (i.e., the JDS is approximately 7
pages and the OJI is 2 pages) and to have the format of the instrument
the same as that of the other instruments within the OAP. The JDS is an
excellent instrument and if the OJI indicates a job enrichment problem
exists within an organization, then a more thorough examination could be
accomplished using the JDS. Table 2 lists the intercorrelations between
the job enrichment factors on the JDS with their counterparts on the OJI.
Table 3 presents the intercorrelation of selected criterion items with
the situational variables of: (a) a total score across items on the OJI
(OJI Total), (b) the Motivation Potential Score (MPS) and Growth Need
Score (GNS) as defined by Hackman et al. (1975), (c) the Need for
Enrichment Index (NEI) which is derived for the OJI and is the total
score of all items indicating a need for enrichment, and (d) the Job
Motivation Index (JMI) which is computed with the same formula as the
GNS.
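
For readers unfamiliar with the score attributed above to Hackman et al.
(1975), their published formula takes the mean of skill variety, task
identity, and task significance and multiplies it by autonomy and by
feedback. The sketch below assumes each dimension is scored on the 1 to 7
scale used by the JDS; this paper does not specify how the OJI items are
scaled or how the JMI is scored, so the function and argument names are
illustrative only.

```python
def motivating_potential_score(skill_variety, task_identity,
                               task_significance, autonomy, feedback):
    """Hackman et al. score from the five core job dimensions (1-7 each)."""
    scores = (skill_variety, task_identity, task_significance,
              autonomy, feedback)
    for score in scores:
        if not 1 <= score <= 7:
            raise ValueError("each dimension is assumed to lie between 1 and 7")
    # The three 'meaningfulness' dimensions are averaged, so a low value on
    # one can be offset by the others; autonomy and feedback multiply, so a
    # low value on either pulls the whole score down.
    meaningfulness = (skill_variety + task_identity + task_significance) / 3
    return meaningfulness * autonomy * feedback


# Hypothetical work group profile (illustrative values only).
print(motivating_potential_score(4, 5, 6, 3, 2))
```

Under the 1 to 7 assumption the score ranges from 1 to 343.
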

DISCUSSION


The OAP is designed to provide indicators of the manager's behavior,
the situational environment and criteria of organizational effectiveness.
Should problem areas be identified then a more detailed investigation will
be performed by on site consultation teams. Once validated the OAP
should provide a means for: (a) identifying organizational strengths and
weaknesses, (b) establishing appropriate managerial behavior in different
situations with different criteria of success, and (c) identifying and
resolving functional, career field, or systematic Air Force problems.


TABLE 1

JOB SATISFACTION QUESTIONNAIRE FACTOR ITEMS

Additional Duties
Equipment and Supplies
Information on Policies and Procedures
Feeling of Helpfulness
Control of Others (Non-Supervisory)
Characteristics of the Local Area
Work Space
Social Contact (Other than Co-Workers)
Co-Worker Relationships
Family Attitude Toward Job
Independence in Work Procedures
Job-Associated Training
Job Hazards
Moral Acceptability of Job
Self-Improvement Opportunities
Temperature of Work Environment
Leave Policies
Work Itself
Work Schedule
Job Security
Safety Programs
Travel
Acquired Valuable Skills
Base Facilities
Base Housing and Eating Facilities
Social Contact Opportunities
Physical Activity
Verbal and Written Communication
Supervisor Responsibilities
Temporary Duty (TDY) Costs and Conditions


TABLE 2

CORRELATIONS OF JOB ENRICHMENT FACTORS ON
THE JDS WITH THOSE ON THE OJI

FACTOR                  CORRELATION COEFFICIENT

Skill Variety                    .60
Task Identity                    .60
Task Significance                .64
Autonomy                         .44
Job Feedback                     .‘j9


TABLE 3

SITUATIONAL ENVIRONMENT VARIABLE PREDICTORS OF CRITERIA

                                     CRITERIA

Situational    Work             Perceived Productivity        Climate
Variable       Satisfaction     1       2       3       4     1       2

OJI Total      .52              .26     .41     .30    -.12   .26     .22
MPS            .67              .21     .39     .32    -.21   .26     .22
GNS            • If;            .21     .16     .16     .10   .14     .22
NEI            .21              .16     .19     .16    -.04   .If?    .16
JMI            .56              .32     .44     .3?    -.17   .30     .25

Perceived Productivity
1 = Quantity of Work Output
2 = Quality of Work Output
3 = Performance when high priority work arises
4 = KfM-'isM.rv in work flow from and to work group

Climate
1 = You are proud of organization
2 = You feel responsible for your organization


QUALITY OF LIFE IN THE U.S. AIR FORCE:
1977  vs.  1975 


Charles  W.  McNichols 
T.  Roger  Manley 
Michael  J.  Stahl 

Air  Force  Institute  of  Technology 


In 1975 the authors reported on the development and initial application
of a nine factor model describing the Quality of Air Force Life (QOAFL).1
Initially, the model was constructed to provide a theoretical framework for
the Air Force Management Improvement Group's (AFMIG's) surveys of Air Force
military members, civilian employees, spouses of military members, and base
commanders. It has since been used as the unifying theme for an Air Force
wide survey of all commanders and in a second survey of Air Force military
members, resulting in a total data base which includes over 50,000 responses.
Data obtained from the most recent military survey effort, performed in the
spring of 1977, offer an opportunity to examine the model for stability and
to look at a longitudinal comparison of Quality of Air Force Life perceptions.
These issues will be explored in this paper, along with some more detailed
comparisons within the 1977 sample.


THE QOAFL MODEL

The nine dimensions hypothesized as encompassing the scope of the
Quality of Air Force Life are listed and defined in Figure 1. In application,
each factor is presented to a survey respondent along with its definition and
a pair of seven point scales to be used in reporting degree of importance
and satisfaction associated with the factor. In the Air Force Quality of
Life surveys, each of the nine factors was followed by a sequence of more
detailed survey items related to the factor.

In the earlier (1975) surveys, an importance scale with seven responses
ranging from "Low Importance" to "High Importance" was used. Most responses
tended to cluster toward the "High Importance" end of this scale. While this
result reinforced the authors' belief that factors of major importance had
been chosen, the importance scale was not very useful for discrimination
purposes. Therefore, in the 1977 survey of military personnel, the scale was
changed to range from "Moderate Importance" to "Very High Importance" as
shown in Figure 2. The rescaling resulted in the hoped for increased variance
in the importance responses, but prohibits meaningful comparison with the
earlier reported importance levels. For this reason, analysis in the
remainder of the paper is based only on the satisfaction responses for each
of the nine dimensions.


Figure 1

QOAFL FACTORS

ECONOMIC STANDARD: Satisfaction of basic human needs such as
    food, shelter, clothing; the ability to maintain an
    acceptable standard of living.

ECONOMIC SECURITY: Guaranteed employment; retirement benefits;
    insurance; protection for self and family.

FREE TIME: Amount, use, and scheduling of free time alone or
    in voluntary associations with others; variety of
    activities engaged in.

WORK: Doing work that is personally meaningful and important;
    pride in your work; job satisfaction; recognition for my
    efforts and my accomplishments on the job.

LEADERSHIP/SUPERVISION: Has my interests and that of the Air
    Force at heart; keeps me informed; approachable and
    helpful rather than critical; good knowledge of the job.

EQUITY: Equal opportunity in the Air Force; a fair chance at
    promotion; an even break in my job/assignment selections.

PERSONAL GROWTH: To be able to develop individual capacities;
    education/training; making full use of my abilities; the
    chance to further my potential.

PERSONAL STANDING: To be treated with respect; prestige;
    dignity; reputation; status.

HEALTH: Physical and mental well-being of self and dependents;
    having illnesses and ailments detected, diagnosed, treated
    and cured; quality and quantity of health care services
    provided.

Figure 2

QUESTIONNAIRE EXAMPLE: FREE TIME

Please rate the degree of importance of free time to you and your
degree of satisfaction with it based on the following description:

FREE TIME: Amount, use, and scheduling of free time alone, or in
    voluntary associations with others; variety of activities
    engaged in.

What degree of importance do you attach to the above?

A . . . B . . . C . . . D . . . E . . . F . . . G
Moderate              High              Very High
Importance         Importance          Importance

To what degree are you satisfied with the FREE TIME aspects
of your life?

A . . . B . . . C . . . D . . . E . . . F . . . G
Highly               Neutral               Highly
Dissatisfied                            Satisfied


ANALYSIS 

In this section of the paper the nine factor QOAFL model will be examined
from a factor stability standpoint, mean satisfaction levels reported in the
1975 and 1977 surveys will be compared, and some of the differences in QOAFL
satisfaction levels for various subsets of the population responding to the
1977 survey of Air Force military personnel will be reported.

Model Stability

Although shifts in mean satisfaction levels are to be expected over time,
the model will be most useful for longitudinal research purposes if the
correlation structure of the nine factors is found to be relatively stable.

To test the model for this factor stability, principal component analysis
results for the 1975 and 1977 surveys were compared. Table 1 summarizes
the two analyses. In both cases, two strong factors can be identified,
explaining slightly over 50 percent of the total variance. The factor loadings
after varimax rotation suggest similar factor interpretations derived from the
two sets of data: a general measure of satisfaction with the work situation
as the first factor, and a measure of satisfaction with economic aspects of
life as an Air Force member as the second.
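
The cumulative percent of variance reported in Table 1 can be recomputed
from the eigenvalues. The sketch below assumes the analysis was run on
the correlation matrix of the nine satisfaction scores, so that the total
variance equals the number of variables; the paper does not state this
explicitly. The eigenvalues are the first set printed in Table 1, and
because they are rounded to two decimals the recomputed percentages may
differ from the printed ones in the last digit.

```python
def cumulative_variance_percent(eigenvalues):
    """Cumulative percent of total variance explained by each component.

    Assumes standardized variables, so total variance equals the number
    of variables (one eigenvalue is reported per variable).
    """
    n_variables = len(eigenvalues)
    running_total = 0.0
    percents = []
    for eigenvalue in eigenvalues:
        running_total += eigenvalue
        percents.append(100.0 * running_total / n_variables)
    return percents


# First set of eigenvalues printed in Table 1.
eigenvalues = [3.60, 1.04, 0.81, 0.77, 0.65, 0.62, 0.56, 0.51, 0.45]
percents = cumulative_variance_percent(eigenvalues)

for number, (eigenvalue, percent) in enumerate(zip(eigenvalues, percents), 1):
    print(f"component {number}: eigenvalue {eigenvalue:.2f}, "
          f"cumulative {percent:.1f}%")

# Components whose eigenvalue exceeds 1.0, a common rule of thumb for
# deciding how many components to retain.
strong = sum(1 for eigenvalue in eigenvalues if eigenvalue > 1.0)
print(f"components with eigenvalue above 1.0: {strong}")
```

Only the first two eigenvalues exceed 1.0, which is consistent with the
two strong factors described above.
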

TABLE 1

Principal Component Analysis

                          Cum % of
Factor     Eigenvalue     Variance

  1           3.60          40.0
  2           1.04          51.3
  3            .81          60.6
  4            .77          69.1
  5            .65          76.3
  6            .62          8.1. 1
  7            .56          89.4
  8            .51          95.0
  9            .45         100.0

Principal Component Analysis

                          Cum % of
Factor     Eigenvalue     Variance

  1           3.50          38.9
  2           1.05          50.6
  3            .83          59.8
  4            .80          68.7
  5            .66          76.1
  6            .64          83.2
  7            .56          89.4
  8            .49          94. g
  9            .46

1975 Military Survey  N = 10,996

Factor Loadings After Rotation

Dimension              Factor 1     Factor 2

Economic Standard        .16          .77
Economic Security        .10          .82
Free Time                .38          .41
Work                     .77
Leadership               .71          .14
Equity                   .54
Personal Growth          .73
Personal Standing        .73          .26
Health                   .25          .47

1977 Military Survey  N = 10,687

Factor Loadings After Rotation

Dimension              Factor 1     Factor 2

Economic Standard        .15
Economic Security        .12          .82
Free Time                .50          .20
Work                     .72          .15
Leadership               .70          .01
Equity                   .54          .38
Personal Growth          .71          .28
Personal Standing        .71          .25
Health                   .31          .45

Comparison of 1975 and 1977 Satisfaction Levels

Although the correlation structure of the QOAFL satisfaction scores did
not change significantly between 1975 and 1977, there were some shifts in
means for particular factors. In Figure 3 a profile diagram has been used to
illustrate the direction and relative magnitude of these shifts. The diagram
represents the mean scores for the overall sets of respondents to the two
surveys for each of the nine factors. As can be seen from this diagram, the
largest shifts in satisfaction levels occurred in the Economic Standard,
Equity, and Economic Security dimensions, and were in the direction of higher
dissatisfaction. All other shifts were small in magnitude, with slightly
higher mean satisfaction reported with Leadership/Supervision, Personal
Standing, Free Time, and Health aspects of respondents' lives, and slightly
lower mean satisfaction reported with the Work and Personal Growth factors.

Figure 3

Comparison of Mean QOAFL Satisfaction Levels Reported in 1975 and 1977
(horizontal axis: mean QOAFL satisfaction, from Dissatisfied through
Neutral to Satisfied)


Differences Among Subsets of 1977 Respondents

In Figure 4, 1977 mean satisfaction scores for officer and enlisted
personnel have been plotted on the profile diagrams. Officer personnel report
significantly higher satisfaction with Economic Standard, Work, Personal Growth
and Standing than enlisted respondents, but mean satisfaction scores on all
other factors are quite similar for the two groups.

As a final example of comparative QOAFL satisfaction levels, the mean
scores reported in the 1977 survey by first term Air Force personnel have
been plotted for each level of career intent reported in the same survey.
The subset of the 1977 sample, in this case, represents all enlisted and
non-rated officer respondents with less than four years service, and rated
officers with less than six years service (because of the longer service
obligation for flight training, rated personnel with under six years service
were considered first termers).
Figure 5 indicates that, while higher levels of career intent are
associated with higher levels of satisfaction for all nine factors, Work and
Economic Standard appear to be the most powerful discriminators of career
intent level.


FIGURE 5

QOAFL Profiles for First Term AF Personnel by Reported Career Intent in 1977

Definitely do not intend to make the AF a career
Most likely will not make the Air Force a career
Undecided
Most likely will make the Air Force a career
Definitely intend to make the AF a career

DISCUSSION


The nine factor Air Force Quality of Life model has now been used in six
major Air Force surveys performed over a two year period. While there have
been some changes in mean satisfaction levels for the nine factors during
this interval, the basic correlation structure of the model appears virtually
unchanged. Use of the 1975 satisfaction data as a baseline, and continued
use of the model as a framework for Air Force opinion and attitude surveys,
seems justified.


NOTES 


1. Manley, T. R., R. A. Gregory, and C. W. McNichols, Quality of Life in the
U.S. Air Force, Proceedings, 1975 Military Testing Association Conference,
Indianapolis, Indiana, Sept. 1975.

2. McNichols, C. W., T. R. Manley and R. A. Gregory, Measuring the Quality
of Life of Air Force Personnel, Proceedings, 1976 Psychology in the Air
Force Symposium, USAF Academy, Colorado, ii-10 April 1976.


Measuring The Quality of Navy Life
by

Richard J. Orend, Robert N. Gaines,
Kenneth W. Stroad and Marsha J. Michaels


INTRODUCTION 


Ultimately, the military's interest in the quality of life reduces to two
basic questions: (1) will improving the quality of life bring increased
reenlistment rates and, by extension, greater enlistment interest, and
(2) how is improvement in the quality of military life related to the
on-the-job productivity of military personnel? If it can be shown that
significant improvements will occur in these areas as a result of
changes in the perceived quality of military life, then extensive
research efforts will have been vindicated. If, however, the results of
these efforts are simply nice to know information and "interesting"
correlations, the resources spent on this research might be put to
more fruitful uses. Of course, the eventual achievement of goals as
ambitious as increasing reenlistment rates and productivity requires
the cooperation of both researchers and policy makers, since the
findings of any research efforts must be translated into concrete
policies and implemented in real environments. Thus, researchers
must operate within the constraints of feasible policies and policy
makers must be willing to experiment and modify some traditional
ideas and procedures if useful results are to be forthcoming.


Our purpose here is to examine efforts to develop the first stages of
this process, namely, the measurement of the quality of military life.
There are two distinct elements to this development, conceptual and
methodological. Previous efforts to develop quality of life measures
in the military have suffered because they generally ignored the
conceptual aspects of the development process. The most important
implications of this omission are the failure to treat all aspects of the
quality of life which might be relevant to reenlistment decisions and
productivity and the absence of a means to evaluate the lists which
were developed. Essentially, there was no basis to judge, a priori,
the inclusion of particular elements of life quality and there was no
structure to serve as a heuristic by which additional variables or
dimensions could be evaluated. This led to instruments which
excluded a large number of potentially useful variables and to the
measurement of what were presumably similar concepts with rather
divergent indicators.

Another conceptual problem which has received insufficient attention
is the decision process by which perceptions of military life are
transformed into decisions about behavior. Of particular importance
here are questions about the relationship of job and non-job activities
and the context in which decisions about reenlistment are made. That
context includes the alternative courses of action open to individuals,
the relative importance of each of the factors in the quality of military
life, experiences in the military, and the fulfillment of expectations
about what military life would be like. Each of these factors can
influence an individual's evaluation of military life, i.e., its quality,
and decisions about whether to remain in the military.

As is evident from the foregoing discussion, the approach we follow is
very broad and is intended to include all factors which may influence
quality of life perceptions. This approach represents our initial
attempt to identify a broad range of variables which may influence
the behavior of military personnel and to examine interactions
between perceptions of different aspects of military life, and between
those perceptions and the context in which they are made. Our
particular emphasis on all elements of the military life situation does
not preclude narrow approaches which focus on one or a limited
number of the factors which we feel are relevant to the discussion of the
quality of military life.* In the following discussion an initial attempt
at developing a general model will be described.

MEASURING  THE  QUALITY  OF  LIFE:  A  CONCEPTUAL  FRAMEWORK 

The  lessons  learned,  both  from  examining  the  theoretical  and 
methodological  issues  inherent  in  the  previous  research  and  from 
inspecting  actual  components  of  research  instruments  employed  in 


* Work by David Bowers, which focuses on the job related aspects of
Navy life, is an example of the more restricted approach which
has produced useful results.
these studies, will be applied in the following to the construction of a
conceptual framework applicable to the measurement of the quality
of Navy life. The process through which this framework will be
fashioned involves: (1) establishing a theoretical structure which
provides a rationale for life quality assessment; (2) identifying a set
of life quality factors which adds substance to the theoretical
structure; and (3) explaining how the resultant conceptual framework
satisfies each requirement inherent in measuring quality of life.

A  Theoretical  Structure  for  Measuring  Quality  of  Life:  The 
theoretical  structure  offered  here  for  the  measurement  of  life  quality 
has  for  its  foundation  the  assertion  that  the  quality  of  an  individual's 
life  is  a  positive  function  of  the  degree  to  which  the  individual's  needs 
are  satisfied.  Thus,  if  nearly  all  of  an  individual's  needs  are  being 
met,  then  his  evaluation  or  expressed  satisfaction  with  the  quality  of 
that  life  will  be  very  high.  If  almost  none  of  his  needs  are  being  met, 
then  the  evaluation  of  his  life  quality  will  be  very  low. 

Based on the assertion above, the notion of quality of life here
receives its primary structure from its analysis into several need
categories. While a number of perhaps equally informative need
taxonomies exist,* the most commonly accepted and frequently
employed scheme of categorisation is that proposed by Maslow.** This
analysis will follow an approach adopted by several other quality of
life studies by utilizing categories which reflect only slight deviation
from the pattern established by Maslow's need hierarchy.***


* On this point see Arnold Mitchell, "Life Ways and Life Styles"
(Menlo Park, CA: Stanford Research Institute, 1973), p. 5.

** See Abraham H. Maslow, "Motivation and Personality" (New
York: Harper and Row, Publishers, Inc., 1954), pp. 80-98.

*** Instances of studies which follow Maslow's categorization of needs
include Angus Campbell, "Aspiration, Satisfaction, and Fulfillment,"
The Human Meaning of Social Change, Ed. Angus Campbell and
Philip E. Converse (New York: Russell Sage Foundation, 1972), 441-466,
and Patricia A. Pecorella, Predictors of Race Discrimination
in the Navy (Ann Arbor, Mich: Institute for Social Research,
University of Michigan, 1975).
The four categories used will be termed: (1) safety and comfort;
(2) belonging and love; (3) esteem; and (4) self-actualization. It is
with respect to these categories, which serve as sub-scales of life
quality, that overall quality of life will be measured.
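
The relationship between need satisfaction and life quality can be illustrated with a minimal sketch. It assumes, purely for illustration, that each of the four need categories is rated on a 0 to 1 satisfaction scale and that overall life quality is the unweighted mean of the four sub-scales; the function name `life_quality`, the scale, and the equal weighting are hypothetical and are not specified by the framework itself.

```python
from statistics import mean

# The four need categories named in the text, used here as sub-scale keys.
NEED_CATEGORIES = ("safety_and_comfort", "belonging_and_love",
                   "esteem", "self_actualization")

def life_quality(sub_scales):
    """Return an illustrative overall life-quality index in [0, 1].

    sub_scales maps each need category to a satisfaction rating between
    0 (need unmet) and 1 (need fully met). Equal weighting of the
    categories is an assumption made only for this sketch.
    """
    for category in NEED_CATEGORIES:
        rating = sub_scales[category]
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for {category!r} must lie in [0, 1]")
    return mean(sub_scales[category] for category in NEED_CATEGORIES)

# Hypothetical respondents: one with most needs met, one with few met.
mostly_met = dict(zip(NEED_CATEGORIES, (0.9, 0.8, 0.9, 0.8)))
mostly_unmet = dict(zip(NEED_CATEGORIES, (0.1, 0.2, 0.1, 0.2)))

print(life_quality(mostly_met) > life_quality(mostly_unmet))
```

The index rises as more needs are met, which is the only property the theoretical structure requires; any monotone combination of the sub-scales would serve equally well.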

Having received primary structure from an analysis of its conceptual
contents into need categories, the notion of quality of life achieves
secondary structure when these categories are themselves analyzed
to reflect the logical distinction which exists between "quality of
life" and "quality of work." This distinction is based on the premise
that some different factors impinge on our lives in work and nonwork
situations and, insofar as this condition exists, these life dimensions
and the context in which they operate should be separately
evaluated. In the military this distinction may be somewhat
less pronounced because of the overall control exercised on
various elements of behavior, such as family separation and living
and working on post, often with the same supervisors.

The  result  of  this  secondary  analysis  is  a  conceptual  matrix  which 
permits  assessment  of  both  quality  of  life  and  quality  of  career  with 
respect  to  each  of  the  need  categories.  Table  I  provides  a 
general  representation  of  that  matrix. 

Factors in Quality of Life/Work: Furnished above has been a
theoretical structure for the notion of life/work quality assessment.
The objective now is to supply a set of factors which may be utilized
as specific measures of quality of life/work. The factors were
generated by means of the following procedure. First, the component
variables from each of the civilian and military related quality of
life/work studies were analyzed on the basis of their general
content and logically associated into groups of similar variables.

The crucial concepts common to groups of variables were then
isolated and identified as preliminary life/work factors. Next, to
this preliminary group was added another group of factors discovered
in an initial analysis of the need categories furnished by the
theoretical structure. The resultant factor set, which is composed of 39
elements, is illustrated in Table II along with corresponding
variables from the studies treated above.

The  quality  of  life/work  factors  set  having  thus  been  presented,  an 
observation  with  respect  to  the  exhaustiveness  of  this  set  is  in  order. 
In  Table  II,  the  factor  set  not  only  exhausts  each  of  the  variables 
utilized  to  assess  quality  of  life/work  of  the  military  related 
TABLE I

THE CONCEPTUAL FRAMEWORK

SAFETY AND COMFORT
  Life: Health and Medical Care; Personal Safety; Living Essentials;
        Local Environment; Convenience
  Work: Income; Secure Employment; Retirement, Medical, and other
        Fringe Benefits; Work Environment; Job Convenience; Sufficient
        Resources to Perform Job; Organisational Climate; Competence
        of Supervisor

BELONGING AND LOVE
  Life: Contribution to Community and Society; Social Life and
        Relationships; Relationships with Close Friends; Relationships
        with Nuclear Family
  Work: Interpersonal Relationships in the Work Environment; Work
        Related Friendships; Family Disruption

ESTEEM
  Life: Self-Esteem; Freedom of Choice and Expression; Equality
  Work: Authority; Responsibility; Occupation Related Prestige;
        Freedom to Decide How Work Should Be Done; Participation in
        Decisions Affecting Own Future; Meaningful Work

SELF ACTUALIZATION
  Life: Cognitive Development; Skill Development; Affective
        Development; Recreation; Travel
  Work: Utilisation of Personal Skills; Opportunity for Advancement;
        Advancement on the Basis of Merit; Interesting Work; Creative
        Experience
TABLE II

FACTORS OF QUALITY OF LIFE/CAREER

[Table body not legible in the source. The first column lists the factor
set (Health and Medical Care, Personal Safety, Living Essentials, Local
Environment, Convenience, Contribution to Community and Society, Social
Life and Relationships, Relations with Nuclear Family, Relations with
Close Friends, and the remaining factors); the other columns list the
corresponding variables from the studies reviewed, among them Wilson,
Flanagan, and Uhlaner; Holz and Gitter; House, Livingston, and Swinburn;
and the Survey Research Center.]
studies, but also includes 80 percent of the variables employed to
measure quality of life and quality of career in the civilian related
studies. In this way, the factor set displays a clear superiority of
extension over the various sets of life/work quality variables used
in the military related studies, and demonstrates a coverage of the
variables critical to quality of life and quality of work measurement
which is roughly equivalent to the more specialised civilian related
studies. Second, despite the degree to which the factor set exhausts
variables relevant to the assessment of quality of life/work, it must
be considered a provisional set. This is because certain factors may
be added or subtracted from the set based on the results of empirical
investigation, conclusions derived from logical inspection of the
theoretical structure, or specific research requirements.

MEASURING NAVY QUALITY OF LIFE/WORK:
A RESEARCH DESIGN


General Approach

The foregoing analysis provides a basic model for the study of the
quality of life/work in any context. In the proposed model we focus
on the satisfaction of individuals with their military (Navy) lives, in
both life and career situations. An analysis built on this framework
can provide the basis for a relatively easy to administer general test
for use with Navy personnel.

The focus of this discussion is both substantive and methodological.
Substantively, we seek to specify some of the major problems
confronting the Navy in terms of general satisfaction of personnel. Our
concern is first to identify the general factors which comprise the
total life space of Navy personnel, then to determine which of those
factors are most closely associated with behavioral decisions,
specifically the decision to reenlist.

Comparative  Analysis 

One of the most important methodological considerations in this
research will be the use of comparison. That is, we want to analyze
satisfaction not just with the Navy per se, but in comparison with
what is expected in the civilian world, the standard against which
individuals will be evaluating Navy life. Certain aspects of the Navy,
e.g., pay, may displease everyone, but a particular
perception becomes important only when there is an alternative which
is perceived as both better and available. Thus, we expect to be able
to learn more about reenlistment decisions from a comparison of
Navy and civilian alternatives than from a Navy evaluation alone.

Other Contextual Factors

In a similar vein, each of the other contextual considerations
mentioned previously is potentially important in the analysis of perceptions
of the quality of Navy (military) life. For example, a difference in
the perceived ability of the Navy to provide free choice in jobs vs.
civilian choice is important only insofar as that freedom is significant
to the individual. Another more popular example is the question of
hair length. Most of the young men in the Navy feel that hair length
regulations are restrictive, more restrictive than in civilian life.
However, whether or not this perception is important in a
reenlistment decision is at least partially a function of how important hair
length is to the individual. We shall call this particular contextual
consideration salience.
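
The interplay of comparison and salience can be sketched in a few lines. The example below is only an illustration under stated assumptions: perceptions are taken to be ratings on a 1 to 5 scale, salience a weight between 0 and 1, and the two are combined by simple multiplication. The function name `weighted_discrepancies`, the factor names, and all of the numbers are hypothetical.

```python
def weighted_discrepancies(navy, civilian, salience):
    """Return salience-weighted Navy-minus-civilian ratings per factor.

    navy, civilian: factor -> perceived satisfaction (assumed 1-5 scale).
    salience: factor -> importance of that factor to the individual (0-1).
    A negative value means the civilian alternative looks better on a
    factor the individual actually cares about.
    """
    return {
        factor: salience[factor] * (navy[factor] - civilian[factor])
        for factor in navy
    }

# Hypothetical respondent: pay is rated equally poor in both settings,
# hair length rules are disliked but unimportant, job choice matters somewhat.
navy = {"pay": 2, "hair_length_rules": 1, "job_choice": 3}
civilian = {"pay": 2, "hair_length_rules": 4, "job_choice": 4}
salience = {"pay": 1.0, "hair_length_rules": 0.0, "job_choice": 0.5}

scores = weighted_discrepancies(navy, civilian, salience)
print(scores)
```

In this sketch pay contributes nothing because no better alternative is perceived, and hair length contributes nothing because it is not salient; only job choice weighs against the Navy.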

Another consideration is the set of expectations about Navy service
which enlistees brought with them. If I entered the Navy expecting to fly
airplanes and ended up chipping paint, it seems likely that I would be
greatly dissatisfied with at least the work dimensions of my Navy
career. While the discrepancy may not be that large in most cases,
there are undoubtedly many instances in which the reality of Navy
life did not correspond with the expectations. At a minimum we
would expect that such considerations would color evaluation of the
Navy in the specific area where differences occur. They could
influence Navy-civilian comparisons as well.

Still another part of the decision context is what actual experiences
individuals had while they were in the Navy. By experience we mean
experience in the institutional sense, such as rating, proportion of sea
duty, and schooling, rather than the day-to-day interactions with peers and
supervisors. The latter type of experience will be reflected in the
specific variables evaluated by each individual and would not
necessarily be associated with such general characteristics as rating.
The former experiences are related to the constant impact of being
at sea or working a particular type of job. While the previous context
factors have to be measured and analyzed simultaneously with
perceptions of quality of life variables, these experiences can be
evaluated on a post-hoc basis by dividing respondents into groups
which exhibit each of the relevant characteristics.
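
A post-hoc division of respondents might look like the following sketch. It assumes each respondent record carries institutional characteristics alongside a single satisfaction score; the record layout, the field names, and the helper `mean_by_group` are hypothetical and serve only to show the grouping step.

```python
from collections import defaultdict
from statistics import mean

def mean_by_group(respondents, characteristic, score_key="satisfaction"):
    """Average a score within each level of one institutional characteristic."""
    groups = defaultdict(list)
    for respondent in respondents:
        groups[respondent[characteristic]].append(respondent[score_key])
    return {level: mean(values) for level, values in groups.items()}

# Hypothetical survey records, invented for illustration only.
respondents = [
    {"rating": "technical", "sea_duty": "high", "satisfaction": 2},
    {"rating": "technical", "sea_duty": "low", "satisfaction": 4},
    {"rating": "clerical", "sea_duty": "high", "satisfaction": 3},
    {"rating": "clerical", "sea_duty": "low", "satisfaction": 5},
]

by_sea_duty = mean_by_group(respondents, "sea_duty")
by_rating = mean_by_group(respondents, "rating")
print(by_sea_duty, by_rating)
```

Because the grouping is applied after the data are collected, the same responses can be re-divided by rating, sea duty, or schooling without changing the instrument.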
SUMMARY OF THE CONCEPTUAL MODEL


The foregoing discussion may be summarized as follows:

(1) Behavior of Navy personnel with regard to a reenlistment
decision is a function of perceptions of Navy life modified by each
individual's comparison of each variable to alternatives in civilian
life and by the importance of that variable in their hierarchy of values.

(2) The variables which exhibit potential significance in these
decisions may be identified through the use of a needs model which
specifies the areas which are likely to be important to various groups
of Navy personnel. Such a model helps to ensure the
comprehensiveness of the variable list and a systematic, balanced approach.

(3) Beyond these basic considerations are such factors as
expectations and Navy experience, which may color the perceptions
of individuals and thereby influence reenlistment decisions.

The usefulness of quality of life research will depend on our ability
to account for each of these factors in a systematic way. By
systematic we mean evaluating decisions so that the impact of each of
these factors can be identified and measured. From this base it will
be possible to generate policy which reflects the reasons for negative
evaluations of the Navy and the precise means to turn such evaluations
(and presumably behavior) around.
"MEASUREMENT OF LEARNED BEHAVIORS IN COMPETENCY
BASED  LEADERSHIP  TRAINING  PROGRAMS" 

A  PRESENTATION 
BEFORE  THE 

MILITARY  TESTING  ASSOCIATION 


BY 

DOROTHY  von  K.  PEPPER,  Ed.  D. 
WORTH  SCANLAND,  Ph.  D. 


For many years the U.S. Navy has been addressing the problem of
training leaders, as have many others in and out of the military
services. There are those who contend that leaders are born, not made,
and there are others who take the position that leadership is a
definable skill which can be identified and taught, and that the
resultant behavior can be measured. The Navy has been in both camps
at one time or another, but at the present time has adopted the
position that leadership can be defined as a learned behavior, subject
to improvement through training, and that these learned skills are
measurable if viewed as competencies. Having reached that conclusion,
the Navy, and specifically the Chief of Naval Personnel, has
commissioned McBer and Company of Boston, Massachusetts, to develop a
competency based leadership training curriculum which can be delivered
to both enlisted and commissioned personnel at several levels of
seniority, primarily at schools associated with entry points into the
Navy or into higher levels of responsibility. It is the purpose of
this paper to describe the method by which this training program is
being developed, the means by which the learned skills are to be
measured, and the contrast between this and more traditional
approaches to training program development.

In a paper appearing in the January, 1973, issue of the American
Psychologist, Dr. David C. McClelland of Harvard University took
umbrage at the concept, so prevalent then as well as now, that
intelligence and aptitude tests adequately measured capabilities in
people to perform certain tasks or jobs. His thesis was that such
tests measured, if anything, capabilities to perform in an academic
setting at academic skills. They did nothing, he contended, to measure
the ability of a police candidate to perform policemen's tasks, for
example. He then went on to describe the concept that only randomly
selected skills required of persons when performing the desired job
could be used as the basis for a test of aptitude for that job. He
called this method "criterion sampling", and it forms the data upon
which valid predictors of future job competency can be established.
When tests for the prediction of future success on the job have been
developed from such data, the validity coefficients have risen from an
average of 0.33 found in the literature on leadership and management
skill characteristics to a mean of 0.60. Inasmuch as the square of a
correlation coefficient indicates the proportion of criterion variance
a measure accounts for, the new job competency assessment results in
roughly a three-fold improvement in the predictive quality of the
measure over previous, traditional means.
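
The arithmetic behind the three-fold figure can be checked directly. The short sketch below squares the two validity coefficients quoted above and takes their ratio; the function name `variance_explained` is a label chosen for this illustration, and the only inputs are the 0.33 and 0.60 values given in the text.

```python
def variance_explained(r):
    """Return r squared, the share of criterion variance accounted for."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r

traditional = variance_explained(0.33)   # average reported for traditional tests
competency = variance_explained(0.60)    # mean reported for criterion-sampled tests
improvement = competency / traditional

print(f"traditional r^2 = {traditional:.3f}")
print(f"competency  r^2 = {competency:.3f}")
print(f"ratio           = {improvement:.1f}")
```

The squared coefficients come to about 0.11 and 0.36, so the ratio is a little above three.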

The Navy has therefore chosen to develop such a set of measures to
determine the leadership and management qualities of its
non-commissioned and commissioned officers. This is probably an
appropriate time to state a definition of "competency" as used in this
discussion, for it will appear many times. "Competency" is used in
goal or outcome oriented training to imply the knowledge, skills,
ability, motives or other characteristics that can be demonstrated to
relate directly to competent occupational performance. In the program
now under discussion the assessment of these job related competencies
among job incumbents is both the source of the determination of the
competencies related to superior leadership and management and the
basis for the measures which subsequently are utilized to determine
each student's pre- and post-training state.

The job competency assessment procedure may be described in three
steps, as follows:

1. The identification of "superior" and "average" criterion samples of
Navy leaders. In order to accomplish this step, a selected group of
commanding officers of fleet units were asked to identify commissioned
and non-commissioned officers aboard their ships who could be placed
in either a "superior performer" category or an "average performer"
category.

2. The conduct of "behavioral event interviews." The officers and
petty officers in the samples taken in the first step were asked to
describe in behaviorally specific terms critical leadership incidents
in which they had been involved. The technique of behavioral event
interviewing, developed by Dr. McClelland of the McBer Company,
involves obtaining a number of descriptions of what he calls
"behavioral episodes." For example, a person might be asked to
describe an incident in which he felt particularly successful (or
unsuccessful) and then to describe in detail the events leading up to
the incident, when and where it occurred, and how he was feeling and
reacting before, during and after the incident. The distinguishing
aspect of this interview technique is that it elicits information from
which actual behaviors can be reconstructed instead of eliciting
interpretations or recollections of general outcomes.

3. Thematic content analysis of behavioral events. Inasmuch as the
officers and petty officers interviewed were selected according to
ratings by their commanding officers, and other criteria, it is
possible to compare the "superior" and "average" interviewees in terms
of the content of their behavioral events. This comparison process
involves, first, the identification of characteristics or themes which
can be drawn from the related incidents in both groups of the sample.
Then it involves the design of a scoring system which will reliably
credit a set of behavioral events from an interview for the presence
of the themes or characteristics. Those themes which are present in
the events related by the "superior" group and not present in those
related by the "average" group become the competency characteristics
that are likely to lead to high performance. Once identified, these
characteristics were subsequently validated through two different
means.
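
The comparison in the third step can be sketched as follows. This is not the McBer scoring system, whose details are not given here; it is a minimal illustration that assumes each interview has already been coded as a set of themes and that a theme "distinguishes" the groups when it appears at least `min_gap` more often among superior performers than among average ones. The function name, the threshold, and the sample themes are all hypothetical.

```python
def distinguishing_themes(superior, average, min_gap=0.5):
    """Return themes markedly more frequent in the superior group.

    superior, average: lists of sets, one set of coded themes per interview.
    min_gap: assumed minimum difference in the proportion of interviews
    showing a theme for it to count as a competency characteristic.
    """
    if not superior or not average:
        raise ValueError("both groups need at least one interview")

    def rate(interviews, theme):
        # Proportion of interviews in which the theme was credited.
        return sum(theme in themes for themes in interviews) / len(interviews)

    all_themes = set().union(*superior, *average)
    return sorted(
        theme for theme in all_themes
        if rate(superior, theme) - rate(average, theme) >= min_gap
    )

# Hypothetical coded interviews, three per group.
superior = [{"sets goals", "coaches"}, {"sets goals", "plans"},
            {"sets goals", "coaches", "plans"}]
average = [{"plans"}, {"plans", "coaches"}, set()]

print(distinguishing_themes(superior, average))
```

Setting `min_gap` to 1.0 reproduces the strict rule stated above, in which a theme must be present in every superior interview and absent from every average one; a lower threshold tolerates the noise expected in real interview data.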
The thematic content analysis of approximately eight hundred critical
leadership incidents collected from fleet interviews provided
twenty-seven reliably distinguishable leadership competency
characteristics, which were subsequently distilled down to five major
leadership and management functions, or factors. Statistical support
for the derivation of these five factors may be found in the McBer and
Company final report of the "Analysis of Leadership and Management
Competencies of Commissioned and Non-commissioned Naval Officers in
the Atlantic and Pacific Fleets." However, because these factors form
the basis for both the determination of the behaviors to be taught in
the training program and the measures for determining the extent to
which the students acquire skills and competencies through the
training, it is important to this discussion that they be described,
and to some extent discussed.

The twenty-seven competency categories, as divided into the five
distinct conceptual clusters or factors, are as follows:


FACTOR  I:  TASK  ACHIEVEMENT 


Navy officers reported numerous incidents in which they expressed
concern for achievement, set specific goals, took initiative to solve
technical problems, or coached others to improve their performance.
Five competency categories make up this factor:

1. Concern for Achievement: Officers expressed a desire to "do jobs
right", to meet standards of excellence, and to advance in their
careers, and felt proud when they had done a job well.

2. Takes Initiative: Officers described taking personal initiative to
overcome obstacles in accomplishing tasks.

3. Sets Goals: Officers articulated specific (often measurable),
challenging but realistic and time-phased goals for their own
performance and that of their subordinates and unit.

4. Coaches: Officers described helping subordinates to accomplish
tasks more effectively by providing information, "showing them how",
or by encouraging their personal development through training or other
enriching experiences.

5. Technical Problem Solving: Officers, particularly enlisted
personnel in technical rates, described thinking analytically in
solving technical problems: observing discrepancies in equipment
performance (problem finding), reasoning deductively to identify the
causes of malfunctions, locating needed resources, anticipating
obstacles, and acting to correct problems.

FACTOR  II:  SKILLFUL  USE  OF  INFLUENCE 

Officers described being concerned with influence strategies
(persuasion, explanation, inspiration, rewards) to accomplish
objectives and motivate subordinates to work as a team. Influence
skill is aided by conceptual thinking about short- and long-range
impact and by emotional self-control. Five competency categories were
included in this factor:

1. Concern for Influence: Officers reported being concerned about
influencing others ("I wanted to convince him"), using their own power
in interpersonal relations, and being sensitive to the political
factors in complex situations.

2. Influences: Officers described acting to influence others without
having to resort to direct orders or threats, using influence
effectively to achieve their ends while making others feel more
efficacious in the process.

3. Conceptualizes: Officers described a high level of conceptual
ability in problem identification, systems analysis, and policy
formulation. This competency is the ability to see patterns in complex
data, separate important information from unimportant, develop
integrative concepts and principles and support these with specific
data, and reconcile exceptions and discrepancies, usually with regard
to having an impact on others or on the system.

4. Team Builds: Officers described encouraging subordinates to work
together as a team, to "buy into" shared unit or command performance
goals, and to create symbols and events which stimulated unit pride
and identity.

5. Rewards: Officers reported rewarding others for good task
performance to influence and motivate subordinates.


FACTOR III: MANAGEMENT CONTROL

Officers described using a straightforward management sequence of
planning and organizing, issuing directions, delegating, matching
people to jobs to be done, monitoring results and giving feedback
in many incidents. Five competency categories were included in
this factor:

1. Plans and Organizes: Officers reported identifying the actions
they needed to take at one point in time to achieve results at
some later time, specifying personnel, materials and other
resources needed, and prioritizing tasks to be accomplished.

2. Directs: Officers, when they did not influence subordinates,
clearly directed them to perform tasks without explanation and in
the absence of personalized threats or punishment.

3. Delegates: Officers described conscious use of the chain of
command to get subordinates to take responsibility for tasks.

4. Optimizes (people-task): Officers reported realistically
assessing people in making personnel decisions to assign tasks to
those individuals most likely to do them well, and in making
trade-offs between task requirements and individual needs (in
Navy parlance, "integration of men and mission").

5. Monitors Results: Officers reported monitoring followup,
checking back to see if management actions, subordinates or
equipment in fact accomplished what they were expected to
accomplish in a given time period.

6. Resolves Conflicts: Officers described negotiating or
mediating interpersonal disputes to a successful resolution,
defined as a "win-win" solution, in which both parties in the
dispute were relatively satisfied and neither lost a
disproportionate amount of power, status or resources.

7. Gives Feedback: Officers reported giving specific feedback to
subordinates on their task performance.

FACTOR  IV:  ADVISING  AND  COUNSELING 


Many leaders and managers described listening to and counseling
subordinates in a high percentage of their critical leadership
incidents. Counseling incidents dealt with four issues:
performance, disciplinary matters, personal problems (including
drug, alcohol, financial and family difficulties) and career
planning. Four competency categories are included in this factor:

1. Listens: Officers reported noticing when subordinates appeared
to be having problems, approaching people to invite them to talk
about issues concerning them, or being perceived as approachable
("...The black guys came to see me because, they said, they felt
I was the only one they could talk to.").

2. Understands: Officers described being able to "hear what
others are trying to say" (accurate empathy or insight into
subordinates' needs, motives or hidden agenda).

3. Helps: Officers detailed the actions they took to help
subordinates in counseling situations, including giving advice,
making time available to talk, acting directly to "fight for
their people," or making appropriate referrals to sources of help
(medical personnel, chaplains, drug and alcohol treatment
facilities).

4. Positive Expectations: Officers expressed positive
expectations of and regard for their subordinates.

FACTOR  V:  COERCION 

Navy officers described critical incidents in which they used
rank or threats to motivate others to act, expressed negative
expectations of subordinates, disciplined them, acted
impulsively, and resolved conflicts by force or failed to resolve
them. Five competency categories were included in this factor:

1. Coerces: Officers described using rank and both general and
personalized threats to motivate subordinates.

2. Negative Expectations: Officers expressed negative regard and
expectations for their subordinates ("He is no good; there's no
way he's going to make it").

3. Disciplines: Officers described punishing subordinates by
giving them negative feedback, poor fitness or evaluation
reports, or using standard UCMJ procedures.

4. Acts Impulsively: Officers reported expressing their emotions
without inhibition, primarily anger ("I blew my stack at him")
and occasionally affiliation ("My buddies had to come first").

5. Fails to Resolve Conflicts: Officers described situations in
which they did not reach "win-win" resolutions of conflicts,
either because they resolved conflicts by unilateral force or by
avoiding dealing with the conflict.

Construct validation for these factors and supporting categories
is available in the literature, and for those who wish to probe
deeper into this aspect of the competencies, the McBer report
mentioned earlier cites the relevant references.

After these competency factors and categories were derived from
the analysis of the data gathered from the interviews in both the
Pacific and Atlantic Fleets, they were cross-validated in
follow-on interviews in both fleets to determine the extent to
which the criteria could discriminate between "superior" and
"average" commissioned and non-commissioned officers drawn from
each fleet. The first four factors, that is, Task Achievement,
Skillful Use of Influence, Management Control and Advising and
Counseling, clearly distinguished between the two categories of
officers and petty officers, although the factor called Coercion
failed to so discriminate, it appearing equally as a
characteristic of both "superior" and "average" personnel.

Perhaps this is an appropriate place to address the question of
the difference, if any, between job competency assessment and
job/task analysis. It would seem that they both accomplish a
common result: they provide a data base from which instructional
programs and achievement measures may be derived, and they
provide a means by which we may prioritize skills and knowledges
required for the successful accomplishment of a job so that we
may place our training resources where they will do the most
good. But there are also some very important differences between
them, it seems to us, and these should be understood. Firstly,
and most importantly, job/task analysis provides information as
to what must be accomplished in order to perform a specific task
or job, whereas job competency assessment provides information
about how to get the job done in the best and most effective
fashion. Secondly, while both address the cognitive domain,
job/task analysis deals primarily with the psychomotor domain
while job competency assessment deals primarily with the
affective domain. And thirdly, job/task analysis most often
concerns itself with technical matters, while job competency
assessment concerns itself with non-technical matters.

As we have discussed briefly, the job competency assessment
function has provided us with a data base upon which to construct
the curriculum for the training program in leadership and
management skills. It has also provided us at the same time with
the data base from which to develop instruments for the
measurement of pre- and post-training skills in leadership and
management.

The set of assessment instruments developed and packaged to
measure the competencies identified as those distinguishing
between superior and average leaders and managers is called the
Navy Leadership and Management Skills Test Battery. It has been
administered to approximately one thousand commissioned and
non-commissioned officers at eight leadership levels, although at
the time of this presentation the data are not yet available for
reporting. The levels at which the battery was administered were
Petty Officer, Leading Petty Officer, Chief Petty Officer, Master
Chief Petty Officer, Division Officer, Department Head, Executive
Officer and Commanding Officer.

The purposes of this test battery development were as follows:

1. To measure the competencies identified as distinguishing
between superior and average Navy leaders and managers.

2. To develop tests which meet the American Psychological
Association psychometric standards for reliability and validity.

3. To develop tests that are Navy relevant and face valid.

4. To develop tests that are easily administered and scored.

Now inasmuch as competencies describe attitudes people have or
actions they take, instead of what they know, tests must measure
what people do, not what they know. Unfortunately, most tests
available have been designed to measure knowledge. So a new
approach to the design of tests had to be made, and this was done
under the guidance of Dr. McClelland of the McBer Company. These
tests have the following attributes:

1. They assess competencies, that is, those behaviors that
distinguish between average and successful performance involved
in clusters of life skills.

2. They should be developed by examining the behaviors of people
exhibiting those behaviors to be measured.

3. They should tap operant as well as respondent thought.

The resulting battery is a composite of four tests which were
available in the market, and four which required development by
the McBer test designers. Table I is a description of these tests
by function and name, and indicates which are standard tests
available in the market and which are the result of McBer
development. The outcome of the pilot tests of these measures,
and a more elaborate and complete discussion of the construction
of the new measures, may be obtained from the Chief of Naval
Personnel if desired.

TABLE I

Tests, their Format, and Competencies Tested for in the
Navy Leadership and Management Skills Test Battery

TESTS WHICH EXIST

TEST                    FORMAT                    COMPETENCIES

Strong-Campbell         Rate your preferences     Technical Problem
Interest Inventory      on a number of            Solving
                        dimensions

Picture Story           Write imaginative         Concern for Achievement
Exercise                stories to each of        Takes Initiative
                        six pictures              Has Concern for Influence
                                                  Has Self-control
                                                  Acts Impulsively

Work Analysis           Express your work         Concern for Achievement
Questionnaire           preferences on a          Concern for Influence
                        series of statements

Organizational          Describe an ideal         Concern for Achievement
Climate Survey          work climate for you      Concern for Influence
Questionnaire           by responding to a
                        series of statements

TESTS WHICH WERE DEVELOPED

TEST                    FORMAT                    COMPETENCIES

Listening and           Write out answers         Listens
Counseling, Part I      about taped voice         Understands
                        excerpts and about        Helps
                        pictures

Listening and           Answer questions          Listens
Counseling, Part II     after hearing each        Understands
                        of four monologues        Has Positive Expectations

Management of           Choose the three          Influences
Problems Test           preferred responses       Team Builds
                        out of six responses      Directs (Authoritarian)
                        given for each of         Resolves Conflicts
                        twenty situations         Coerces
                                                  Fails to Resolve Conflicts

Managerial Style        Give each of three        Coaches
Questionnaire           responses to a series     Rewards
                        of sentences a            Plans and Organizes
                        preferred number of       Delegates
                        points                    Monitors Results
                                                  Gives Feedback
                                                  Punishes

Planning Exercise       Find the most             Plans and Organizes
                        efficient schedule
                        for a series of tasks

To summarize, we may say that for the first time in the Navy
there has been described through valid research a set of
competencies, that is, skills and characteristics which are
demonstrably capable of discriminating between superior and
average leaders and managers among naval officers and petty
officers, and which can therefore be reliably used as a data base
for the development of training programs in leadership and
management, including the instruments for the measurement of
achievement of the required behavior for good leadership and
management in the Navy. The Bureau of Naval Personnel, through
the work of their contractor, McBer and Company of Boston, are
currently developing leadership and management training programs
at a number of accession points and grade levels of naval
personnel, and the Naval Education and Training Command, to be
later charged with the responsibility for the implementation of
these programs, is carefully monitoring the process. Initial
implementation is planned to commence in 1978.


TESTING IN THE AFMET PROGRAM
Wallace  Bloom,  Ph.D. 
Wilford  Hall  USAF  Medical  Center 


Introduction:  Premature  termination  of  military  service  is 
often  caused  by  psychological  problems  that  existed  prior  to  service 
and  led  to  poor  emotional  adaptation  to  military  life.  There  has 
been  previous  research  towards  the  development  of  an  objectively 
scored  questionnaire  capable  of  discriminating  between  individuals 
who  will  develop  emotional  or  characterological  difficulties  during 
training and those who will successfully adjust to military life.
Danielson  and  Clark  (1954)  studied  a  sample  of  15,550  Army  recruits; 
Jensen's  (1964)  82-item  questionnaire  had  been  given  to  9,194  Air 
Force  recruits,  and  Plag  (1962)  used  a  195-item  questionnaire  on 
20,000  Navy  recruits.  LaChar,  et  al  (1974)  reported  on  a  1972 
investigation  of  14,804  male  Air  Force  recruits  and  the  evolution 
of the history-opinion-interest form (HOI). Guinn, et al (1975)
concluded  that  the  HOI  has  some  practical  usefulness  as  a  rough 
preliminary  screening  device.  They  recommended  that  use  of  this 
screening  device  be  limited  to  preliminary  screening  only  and 
that additional psychometric and/or psychiatric assessment be
mandatory  before  any  personnel  action  is  recommended. 

A  proposal  for  further  research  and  development  of  a  military 
adaptability  screening  test  was  prepared  in  1974  by  Captain  Charles 
I.  Bisbee,  Major  George  E.  Hargrave,  and  Colonel  John  C.  Sparks. 

The research protocol designed by the Department of Mental Health,
Wilford  Hall  USAF  Medical  Center,  in  conjunction  with  the  Surgeon, 
Air  Training  Command  and  other  ATC  agencies,  was  implemented  in 
June  1975  and  identified  as  AFMET  (Air  Force  Medical  Evaluation 
Test Program) (Bloom, 1977a).

PHASE  I:  On  the  arrival  day,  each  airman  entering  the  Air 
Force was given a psychological test. Those identified by computer
scoring  as  low  mental  health  risks  (approximately  93%)  continued 
basic  training  without  further  evaluation. 

PHASE II: By the first day of training, those not identified
as low-risk were called in for individual mental status interviews
with mental health technicians, and additional psychological tests
were given. The reports of these interviews and tests were reviewed
by a senior clinical psychologist, who determined which additional
trainees would now be identified as low-risk and did not require
further evaluation.


Presented at the 19th Conference of the Military Testing Association,
October  18,  1977,  San  Antonio,  Texas. 

PHASE III: Usually by the fifth day of training, those airmen
(approximately 2 to 3% of the trainees) not already identified as
low-risk were referred to the Mental Hygiene Clinic for further
evaluations. They brought with them reports of behavior observations
and comments from their training squadrons (ATC Form 582). Clinical
mental health interviews were conducted, and additional psychological
tests were selected and given on an individual basis as necessary.
Those without serious problems were returned to duty, and the few
identified as being psychotic were referred to the hospital for
treatment and further action. Some showed evidence of specific
character and behavior disorders of such a nature as to seriously
impair military performance. They were referred back to their
training squadron commanders with recommendations for administrative
separation with medical classification and behavior reports.
A few were recommended for referral to other special agencies. (See
Illustration 1).

HOI: Phase I testing was limited to the original History Opinion
Inventory (HOI), composed of 100 items that the subject was to
identify as True or False as applied to him. Examples are: I
quit school because I was failing. I was active in sports during
high school. I have been in trouble with the police. I like
hunting very much. Marking was on opscan type answer sheets which
were scored by computer, and the names and other identification of
those selected for further testing were contained in the printout
sheets.

In 1975, during the initial days of the project, it was
quickly learned that the HOI Adaptation Index cutoff score of
12 that had identified 12% of the sample in 1972 only identified
one-half of 1% in 1975. It was believed that, due to the termination
of the Draft Law and other factors, the new recruits were scoring
significantly differently than the earlier subjects. The cutoff
score was reduced 4 points. 38,529 basic trainees took the HOI
between June 1 and November 7, 1975, and approximately 6% were
identified for Phase II screening (Table 1).

The HOI has two subscales. PEI (Emotional Instability)
consists of 18 scored items and the PDA (Drug Use Admission) of
26 scored items. Nine of the items are critical on both subscales.
In November 1975, the test was reduced from 100 items
to 50 without changing or eliminating any of the scored items.
The five-page booklet was replaced by a one-page card which
could be aligned with the answer sheet to minimize marking
errors. The cutoff score remained unchanged.

On  1  October  1976,  the  AFMET  program  shifted  from  a  Research 
Project  to  a  Standard  Operating  Procedure.  The  HOI  test  was  given 
after the first day of training rather than on arrival. The trainees
would have spent at least one night in their dorms, and met
their primary instructors prior to taking this test. During a few
trial days, it was found that the cutoff score of 8 points now
identified 22% instead of the approximate 6 to 7% identified from
June 1, 1975 to June 15, 1976. The cutoff score was raised to 11
points with daily monitoring to shift it to 12 when the quota might
be exceeded.
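
The cutoff adjustments described above can be illustrated with a minimal Python sketch. It assumes, consistently with the figures reported here but not stated outright, that a trainee is selected for Phase II when the Adaptation Index score is at or above the cutoff. The function names, the sample scores, and the 25% quota are hypothetical and are not data from the AFMET program.

```python
def fraction_flagged(scores, cutoff):
    """Share of trainees whose score is at or above the cutoff (assumed rule)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= cutoff) / len(scores)


def lowest_cutoff_within_quota(scores, quota, candidates):
    """Lowest candidate cutoff whose flagged share does not exceed the quota."""
    for cutoff in sorted(candidates):
        if fraction_flagged(scores, cutoff) <= quota:
            return cutoff
    return None  # even the highest candidate cutoff flags too many trainees


# Hypothetical scores for 20 trainees, for illustration only.
sample_scores = [2, 3, 3, 4, 5, 5, 6, 6, 7, 7,
                 8, 8, 9, 9, 10, 11, 11, 12, 13, 14]

print(fraction_flagged(sample_scores, 8))
print(lowest_cutoff_within_quota(sample_scores, 0.25, range(8, 13)))
```

Raising the cutoff can only shrink the selected group, which is why a daily check of the flagged share against the quota is enough to decide whether to move from one cutoff to the next.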

MMPI: The selectees' MMPI responses were marked on Optiscan sheets,
machine scored, and computer print-outs were returned a day later.
These print-outs were based on a program developed by LaChar (1974).
The consolidated data (Standardized Report of Interview (SRI) and
MMPI) were then reviewed by a clinical psychologist who determined
which "low risk" individuals warranted continuation in training
without referral to the Mental Hygiene Clinic for further action.
The others were referred to that clinic.

Approximately  400  subjects  were  given  MMPI's  each  month 
(Bloom,  1977b).  As  the  program  continued,  it  was  noted  that  a 
significant  number  of  airmen  scored  highly  elevated  MMPI  T-scores, 
often above 100, but were reported as within normal limits by the
interviewers.  During  the  first  two  months  of  the  study,  many  air¬ 
men  with  these  elevated  MMPI  scores,  but  low  risk  interview  reports, 
were  referred  to  the  Mental  Hygiene  Clinic.  Further  psychiatric 
assessment  there  usually  resulted  in  these  airmen's  continuation 
in basic training. The MMPI norms used were based on the general
Minnesota  normal  sample  (Dahlstrom,  et  al,  1960,  pp  437-8).  During 
July,  the  MMPI's  of  17  year-old  airmen  were  recorded  by  hand  using 
the  T-score  conversions  for  basic  scales  without  K  corrections  of 
Minnesota  Adolescents,  age  17  (Dahlstrom,  et  al,  1972,  pp  397-99). 
Although  extreme  T-scores  were  somewhat  less  elevated  than  when 
converted  the  traditional  way,  it  was  questioned  whether  these 
norms were appropriate. It was hypothesized that neither the
traditional adult norms nor the adolescent (age 17) norms were relevant for
judging  the  MMPI  responses  of  basic  airmen. 

Subjects: Both male and female recruits in basic training
squadrons were tested, rather than only the HOI selectees, who might
have been a skewed sample. Entire squadrons were tested during
the period August to October, 1975. Personnel came from all over
the United States. Data were obtained from 1152 males and 805
females.

Procedures:  Standard  MMPI  booklets  were  used  and  responses 
were  marked  on  Opscan  sheets  for  subsequent  machine  scoring. 

Raw  scores  for  each  category  were  punched  on  IBM  cards  and  the 
data  statistically  analyzed  by  the  Biometrics  Division  of  the  Air 
Force Systems Command. The analysis of variance for disproportionate
data using a general regression model was based on Graybill's
work. 

Results: The demographic composition of the Lackland population
reflected the recruiting criterion that each recruit must be a high
school graduate or pass a GED equivalency test. Only 3.4% of the men
and 2.8% of the women were not high school graduates. 12.2% of the
males and 18.9% of the females had reported education beyond
high school level (See Table 2). Over eighty (80) percent were
"Anglo-Americans" (Caucasian), and about ten (10) percent of the
total sample were "Black-Americans". Further details are shown
in Table 3. Only two percent of the men, but almost 13 percent of
the women, were over 23 years old (See Table 4).

Means  and  Standard  Deviations  on  Basic  MMPI  factors  were 
obtained for 1152 male and 805 female participants (See Table 5).
Comparisons with traditional MMPI norms for adults revealed
statistically significant differences in all scales except Social
Introversion (0-Si). A second set of analyses between the Lackland
population  and  the  norms  of  Minnesota  Normal  Adolescents,  age  17, 
indicated  significant  differences  in  almost  all  scales  (except 
Males:  Lie,  Depression,  and  Hysteria  and  except  Females:  Lie  and 
Paranoia) . 

Conclusions: Norms for purposes of comparison must be
relevant  and  will  be  meaningless  or  even  misleading  if  they  are 
not  based  on  groups  of  people  with  whom  it  is  sensible  to  compare 
the  individuals  we  are  psychologically  assessing.  Neither  the 
MMPI norms for Minnesota Adults nor those for Minnesota Normal
Adolescents, age 17, were relevant for comparison with Air Force
Basic Trainees.


The standard MMPI norms were not designed for use with a
young adult military population. A comparison of the Lackland
sample with the characteristics of the Minnesota sample cited by
Hathaway and Briggs (1957) follows:


VARIABLES       LACKLAND               MINNESOTA

Geographic      National               Regional
Age             Predominately 17-23    Mean 33
Race            Included minorities    White
Education       12 or more             Mean 9.7 to 10.1
Population
  Male          1152                   111 to 345
  Female        805                    118 to 397
  Note                                 (varied by scale)
Tested          1975                   1940

It was appropriate to establish new norms based on these
individuals, and this has been done, along with appropriate
T-conversions (Tables 6 and 7) and plotting charts (Illustrations 2, 3).
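
How a change of norm group moves a T-score can be shown with a short Python sketch. It uses the standard linear T-score transformation (mean 50, standard deviation 10); the text does not say which conversion procedure was used for Tables 6 and 7, so treating it as linear is an assumption, and K corrections are ignored. The raw score, means, and standard deviations below are hypothetical and are not taken from those tables.

```python
def t_score(raw, norm_mean, norm_sd):
    """Linear T-score: 50 plus 10 times the z-score against the chosen norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Hypothetical values for a single scale, for illustration only.
raw = 30
general_norm = {"mean": 20.0, "sd": 5.0}   # stands in for a general adult norm
local_norm = {"mean": 26.0, "sd": 5.0}     # stands in for a locally derived norm

print(t_score(raw, general_norm["mean"], general_norm["sd"]))
print(t_score(raw, local_norm["mean"], local_norm["sd"]))
```

The same raw score looks far more extreme against a norm group whose mean lies well below that of the population actually being tested, which is the practical argument for population-specific norms.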

From November 15, 1975 to April 1, 1976 these Lackland MMPI
norms were utilized, along with a sentence completion test, as
part of Phase II testing. The MMPI was not used from April 1, 1976
to June 15, 1976, and fewer unnecessary referrals were made to
Phase III, the Mental Hygiene Clinic. The AFMET '77 program does
not use the MMPI for Phase II, but it is used in some selected
individual cases at the Mental Hygiene Clinic for Phase III. Formerly
one out of six Phase III selectees was discharged; currently
about one of three is. The MMPI proved ineffective for Phase II
testing.

BSCS: In November 1976, as part of Phase II interviewing, the
Bloom Sentence Completion Survey (BSCS) was given to facilitate
establishing  rapport  between  the  trainees  and  the  enlisted  mental 
health  technicians  conducting  the  interviews.  Initially  intended 
to  serve  as  an  icebreaker  for  the  interview  and  to  provide  some 
advance  information,  it  was  found  to  take  only  about  12  minutes 
for  group  testing.  The  results  were  read  by  the  interviewer 
before  the  trainee  was  seen  and  some  key  responses  were  under¬ 
scored  and  often  referred  to  during  the  interview.  The  inter¬ 
viewers  later  became  interested  in  numerical  scoring  and  were 
instructed  in  identifying  each  response  as  positive,  neutral,  or 
negative  by  categories. 

This  test  (Dlo^m,  1975)  purports  to  indicate  both  positive 
and  negativj  aspects  of  attitudes  towards;  people,  physical 
self,  family,  psychological  self,  self-directedr.ess ,  work,  and 
accomplishment  in  addition  to  identifying  some  irritants  in  each 
subject's  life.  After  a  week  the  inter-scorer  correlations  were 
about  .90  with  rarely  more  than  4  items  scored  differently  than 
by  the  instructor.  After  a  month,  it  was  rare  for  them  to  differ 
on  more  than  two  of  the  35  scored  responses.  Scoring  time  was 
usually  less  than  seven  minutes.  For  comparison  purposes  the 
scores  of  random  trainees  were  compiled.  These  mean  scores  and 
standard  deviations  are  in  Table  8. 
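The inter-scorer agreement described above can be sketched as a
correlation between an instructor's and a technician's item scores.
This is a minimal illustration that assumes a Pearson correlation and
a +1/0/-1 coding for positive, neutral, and negative responses; the
paper does not state which coefficient was used, and the ten item
scores below are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical item scores: +1 positive, 0 neutral, -1 negative.
instructor = [1, 0, -1, 1, 1, 0, -1, -1, 0, 1]
technician = [1, 0, -1, 1, 0, 0, -1, -1, 0, 1]

r = pearson_r(instructor, technician)
items_scored_differently = sum(a != b for a, b in zip(instructor, technician))
```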

The 2879 trainees interviewed at Lackland AFB during Phase II of
the AFMET research program (January 1, 1976 - June 15, 1976) had:
clinical record notes, a standardized report of their interviews on
an op-scan type sheet (SRI), a scored sentence completion test, and
in most cases (all those from 2 January 1976 to 1 April 1976) the
computer print-out of each MMPI. Of these trainees, 2213 were
returned to duty in their training squadrons while 666 were
scheduled for further evaluations at the Mental Hygiene Clinic. As
noted on Table 9, the BSCS composite scores of the returned to duty
trainees averaged 8.8450 while that of those referred to the Mental
Hygiene Clinic for further evaluation was -.0376. There were
statistically significant differences on each subscore between the
group cleared and those to be further evaluated. These differences
as reflected in the T-scores were so large that the chances of being
accidental were less than one in a thousand. Of the 666 individuals
referred to Phase III, 522 who were subsequently returned to duty
(RTD) had an average composite score of .3006 while 114 who were
discharged had an average composite score of minus 2.0364. The
subscores for attitudes towards physical self, self-directedness
and work were significantly different.
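The composite scores quoted above appear to be the plain sum of the
seven subscale scores: adding the returned-to-duty subscale means
printed in Table 9 reproduces the 8.8450 figure. The sketch below
makes that assumption explicit; the function and variable names are
illustrative and not part of the BSCS manual.

```python
# Subscale means for the returned-to-duty (RTD) group, as printed in Table 9.
RTD_SUBSCALE_MEANS = {
    "People": -0.3954,
    "Physical Self": 0.9440,
    "Family": 1.4867,
    "Psychological Self": 0.3814,
    "Self-Directedness": 2.4550,
    "Work": 1.1631,
    "Accomplishment": 2.8102,
}


def composite_score(subscale_scores):
    """Composite (net) score, assumed here to be the sum of the seven subscales."""
    return sum(subscale_scores.values())


rtd_composite = composite_score(RTD_SUBSCALE_MEANS)
```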

Conclusions: Of the psychological tests used in the program,
the HOI functioned about as well as predicted, as it identified
approximately one-third of the basic trainees given mental health
discharges. The other two-thirds were referred by the training
squadrons directly to the Mental Hygiene Clinic, usually after the
10th day of training. The recommendations made about the HOI in
1975 (Guinn et al) are still pertinent with regard to the need for
revalidation, special scales for a WAF population, and its use in
combination with additional aptitudinal and biographic data
available on all recruits.

The MMPI did not work out in AFMET as a screening device at
Phase II level, but with current Air Force norms it appears to have
use along with other tests on a selected individual basis as
part of diagnosis in Phase III.

The Bloom Sentence Completion Survey proved useful clinically
as part of the Phase II interviews and assessment. Statistically
it differentiated between groups of trainees who did or did not
require further evaluations and between those discharged or
returned to duty. Further research needs to be undertaken regarding
the composite score of the seven subtests.

Additional research on the HOI is being conducted by the
Personnel Division, Headquarters USAF, which has given a version
of the HOI to over 80,000 subjects at the AFEE stations, and by
analysis of earlier Lackland data by the Human Resources Lab.
Further research to follow trainees through their enlistments,
identify those with early separations for mental health associated
causes, and compare their AFMET test scores still remains
to be done, and the first of the AFMET subjects will not complete
their enlistments until 1979.

During the research year of the AFMET program (June 1, 1975
to June 15, 1976) 80,732 new arrivals were given the HOI through
Phase I of the program; of these, 5369 received Phase II interviews
and tests, which resulted in 1331 being referred to Phase III for
further evaluations. Four hundred and forty-four (444) of these
basic trainees were discharged (including 79 by squadrons, 59 for
medical reasons, and 306 recommended by the Mental Health Clinic).

During the operational period October 1, 1976 to October 1,
1977, reportedly 73,666 went through Phase I; 4918 were selected
for Phase II; and 1054 were referred to Phase III. Five hundred
and twelve (512) were recommended for discharge through the
Mental Hygiene Clinic (Phase III), and 308 Phase II selectees were
discharged by other agencies before the AFMET actions could be
completed. Seventeen were discharged by the hospital for mental
health reasons.
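The screening funnel in the two paragraphs above can be checked with
simple arithmetic. The sketch below uses only the counts reported in
the text; the function and key names are illustrative. In both years
just under 7 percent of Phase I arrivals went on to Phase II.

```python
def stage_rates(phase1, phase2, phase3):
    """Share of each stage that was passed on to the next stage."""
    return {
        "phase2_share_of_phase1": phase2 / phase1,
        "phase3_share_of_phase2": phase3 / phase2,
    }


research_year = stage_rates(phase1=80732, phase2=5369, phase3=1331)
operational_year = stage_rates(phase1=73666, phase2=4918, phase3=1054)

# The discharge categories reported for the research year should add up
# to the stated total of 444.
research_discharges = {"squadrons": 79, "medical": 59, "mental_health_clinic": 306}
total_discharged = sum(research_discharges.values())
```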

The early identification of trainees with significant
psychological problems facilitated their return to their civilian
lives without further trauma due to military stress, and the
termination of further Air Force investment in their training. It
has been estimated that well over one million dollars has been saved
at Lackland each year of the AFMET program, plus further indirect
savings by curtailment of investments in technical school training
for individuals who do not complete normal enlistments. The
AFMET program seems to have benefited both the USAF and the
individuals.




REFERENCES


Bloom, W., Manual, Bloom Sentence Completion Test: Student and
Adult. Effectiveness Training Associates of San Antonio, T3T
Twinleaf Ln., San Antonio, Texas 78213, 1975.

Bloom, W., Air Force Medical Evaluation Test. Medical Service
Digest, United States Air Force, 28(2), Mar/Apr 1977.

Bloom, W., Relevant MMPI Norms for Young Adult Air Force Trainees.
Scheduled for publication in Journal of Personality Assessment in
December, 1977.

Danielson, J.R. & Clark, J.H. A personality inventory for induction
screening. Journal of Clinical Psychology, 1954, 10, 137-143.

Guinn, N., Johnson, A.L. & Kantor, J.E. Screening for adaptability
to military service. Report No. AFHRL-TR-75-30, Air Force Human
Resources Laboratory, May 1975.

Jensen, M.B., Adjustive and non-adjustive reactions to basic
training in the Air Force. Journal of Social Psychology, 1961,
55, 33-41.

LaChar, D., The MMPI: Clinical Assessment and Automated
Interpretation. Los Angeles, California, Western Psychological
Services, 1974.

LaChar, D., Sparks, J.C. & Larsen, R.M., Psychometric prediction of
adaptation for USAF basic trainees. Journal of Community Psychology,
1974, 2(3), 268-277.

Plag, J.A. Pre-enlistment variables related to the performance and
adjustment of Navy recruits. Journal of Clinical Psychology, 1962,
19, 168-171.



LIST OF ILLUSTRATIONS & TABLES

Illustrations

1 - AFMET-1977 Flow Chart
2 - T Score Conversions for MMPI Scales Without K Corrections (Female)
3 - T Score Conversions for MMPI Scales Without K Corrections (Male)

Tables

1 - HOI Scores for Basic Trainees
2 - Education Levels of Trainees, Lackland MMPI Sample
3 - Ethnic Identification of Trainees, Lackland MMPI Sample
4 - Age of Trainees, Lackland MMPI Sample
5 - MMPI Means & Standard Deviations of Basic Trainees, Lackland AFB
6 - T-Conversions for Basic MMPI Scales (Female)
7 - T-Conversions for Basic MMPI Scales (Male)
8 - Bloom Sentence Completion Survey (Adult) Means, Standard
    Deviations, and Inter-Scale Correlations
9 - Bloom Sentence Completion Survey Scores, 2879 Basic Trainees


ILLUSTRATION 1

AFMET-1977 flow chart. Recoverable labels: Arrival Day, assignment
to training squadrons; AFMET Phase I, all trainees, screening
psychological test; 93%; AFMET Phase II, 3rd to 5th day, 7% of
trainees, interview and testing; Basic Training.


Table 1

HOI SCORES FOR BASIC TRAINEES
Lackland Air Force Base, Texas
June 1 to November 7, 1975

HOI High Risk                      Cumulative   Cumulative
Adaption Index         Number      Number       Percent

12 and above              299         299          0.78
11                        181         480          1.25
10                        347         827          2.15
9                         557       1,384          3.59
8 (cut off point)         905       2,289          5.94
7                       1,619       3,908         10.14
6                       1,825       5,733         14.88
5                       3,353       9,086         23.58
4                       4,938      14,024         36.40
3                       4,742      18,766         48.71
2                       9,118      27,884         72.37
1                       4,952      32,836         85.22
0                       5,693      38,529        100.00
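The cumulative columns of Table 1 follow directly from the Number
column. The sketch below recomputes them, reading the printed periods
in the counts as thousands separators; this is a check on the table's
arithmetic under that reading, and the names used are illustrative.

```python
from itertools import accumulate

# (HOI score, number of trainees) from Table 1, highest-risk scores first.
HOI_COUNTS = [
    ("12 and above", 299), ("11", 181), ("10", 347), ("9", 557), ("8", 905),
    ("7", 1619), ("6", 1825), ("5", 3353), ("4", 4938), ("3", 4742),
    ("2", 9118), ("1", 4952), ("0", 5693),
]


def cumulative_table(counts):
    """Return rows of (score, number, cumulative number, cumulative percent)."""
    total = sum(n for _, n in counts)
    running = accumulate(n for _, n in counts)
    return [
        (label, n, cum, round(100.0 * cum / total, 2))
        for (label, n), cum in zip(counts, running)
    ]


rows = cumulative_table(HOI_COUNTS)
cutoff_row = next(row for row in rows if row[0] == "8")  # the cut off point
```

Under this reading, a cut off at a score of 8 selects a little under
6 percent of the trainees tested in the period.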

Table II


EDUCATION LEVELS OF TRAINEES, LACKLAND MMPI SAMPLE
AUGUST TO OCTOBER, 1975


                         MALE                   FEMALE
GRADE COMPLETED    FREQUENCY   PERCENT    FREQUENCY   PERCENT

8                       0         0            1        0.1
9                       3          .3          0        0
10                      9          .8         12        1.5
11                     26         2.3         10        1.2
12                    971        84.3        638       78.3
13                     82         7.1         78        9.7
14                     35         3.0         38        4.7
15                      8          .7          9        1.1
16                     17         1.5         17        2.1
Post Grad.              1          .1          2         .2

TOTAL                1152                    805

Table III


ETHNIC IDENTIFICATION OF TRAINEES, LACKLAND MMPI SAMPLE
AUGUST TO OCTOBER, 1975


                            MALE                   FEMALE
                      FREQUENCY   PERCENT    FREQUENCY   PERCENT

ANGLO-AMERICANS          999        86.7        656       81.5
BLACK-AMERICANS          104         9.0        116       14.4
HISPANIC-AMERICANS        33         2.9          2        0.2
ORIENTAL                  12         1.0         22        2.7
OTHER/UNKNOWN              4                      9        1.1

TOTAL                   1152                    805

Table IV

AGE OF TRAINEES, LACKLAND MMPI SAMPLE
AUGUST TO OCTOBER, 1975


                    MALE                   FEMALE
AGE           FREQUENCY   PERCENT    FREQUENCY   PERCENT

17               131        15.7         88       10.9
18               515        44.7        272       33.8
19               201        17.4        128       15.9
20               102         8.9         82       10.0
21                68         5.9         55        6.8
22                33         2.8         35        4.3
23                28         2.5         41        5.1
24                16         1.4         25        3.1
25                 8          .7         20        2.5
26                 1          .0         21        2.6
27                 0          .0         15        1.8
28-34              1          .0         23        2.9

TOTAL           1152                    805

Table V

MMPI MEANS AND STANDARD DEVIATIONS
OF BASIC TRAINEES, LACKLAND AFB
AUGUST TO OCTOBER 1975

TABLE VIII


BLOOM SENTENCE COMPLETION SURVEY - ADULT
207 Basic Trainees - Lackland AFB, TX
July & August 1975
ITEM ANALYSIS OF SEVEN SCALES


SCALE NAME                 SCALE SCORES
                         MEAN        SD

People                 -1.0483     2.0353
Physical Self            .8019     2.0699
Family                  1.1304     2.4978
Psychological Self       .6522     2.2618
Self-Directedness       1.9517     2.1440
Work                     .7874     1.8581
Accomplishment          1.5459     2.1325

INTER-SCALE CORRELATIONS

                          1        2        3        4        5        6

1 People               1.0000
2 Physical Self         .1285   1.0000
3 Family                .1704    .1723   1.0000
4 Psychological Self    .2010    .3444**  .2124   1.0000
5 Self-Directedness     .2463*   .2602*   .1022    .4299** 1.0000
6 Work                  .1902    .1938    .0768    .2422    .3491** 1.0000
7 Accomplishment        .1285    .0387    .0084    .2407    .3966**  .1817

NOTE:  *Significant at 5% level of confidence
      **Significant at 1% level of confidence

TABLE IX


BLOOM SENTENCE COMPLETION SURVEY SCORES
USAF BASIC TRAINEES, LACKLAND AFB, TX
Jan 1, 1976 to Jun 15, 1976

                        ALL              RTD                 MHC
                      n = 2879         n = 2213            n = 666
                    Mean     sd      Mean      sd        Mean      sd

People                     1.9806   -0.3954   1.9232    -1.2462   2.0274
Physical Self              2.2540    0.9440   2.1750     0.3078   2.2447
Family                     2.4717    1.4867   2.3862     0.4715   2.5878
Psychological Self         2.1665    0.3814   2.0597    -1.1261   2.1087
Self-Directedness          2.1344    2.4550   1.9516     0.7462   2.1893
Work                       1.7902    1.1631   1.6627     0.1426   1.9665
Accomplishment             2.1436    2.8102   1.9118     1.2222   2.4223

NET SCORES         6.7437            8.8450             -0.0976


                    Phase III-RTD      Phase III-Disch
                                          n = 114
                    Mean                Mean      sd

People                                 -1.3509   2.0261
Physical Self                          -0.7018   2.1148
Family                                  0.3421   2.5506
Psychological Self                     -1.2719   2.1246
Self-Directedness                       0.2719   2.2096
Work                                   -0.3158   2.0101
Accomplishment                          1.0000   2.2084

NET SCORES          0.3008             -2.0364

 * P < .05
** P < .01

RTD - Return to Duty
MHC - Referred to Mental Hygiene Clinic


DEVELOPMENT OF
A WEIGHTED SELECTION SYSTEM
FOR
THE AFROTC PROFESSIONAL OFFICER COURSE

by

Lt Col David K. Jackson
Chief, Education Evaluation Division, AFROTC

Mr. M. Meriwether Gordon
Education Specialist, AFROTC

BACKGROUND


In April 1976, a Quality Working Group composed of representatives
from the Air Staff, Air University, and AFROTC convened at Maxwell AFB,
AL. Its goal was to seek alternatives for reversing a perceived
downward trend in the quality of AFROTC commissionees as reflected by
increasing numbers of commissionees with low AFOQT scores and by high
attrition rates in the Undergraduate Flying Training Programs and in
technical schools. The Quality Working Group investigated several
approaches toward improving quality and placed major emphasis on the
development of a weighted selection system for POC admission. Air
University and AFROTC were tasked to develop a selection system for
implementation on or about 3 January 1977. Initial stages in the
development of the Weighted Professional Officer Course Selection
System (WPSS) involved an investigation into the types of standardized
tests used to evaluate potential qualification and various other factors
that might influence selection of quality individuals. It was decided
that Policy Capture techniques would be used to develop the Weighted
POC Selection System.*

In  October,  AFROTC  convened  a  board  of  nine  Air  Force  officers 
and  civilians  to  review  the  folders  of  500  cadets  entering  the  POC  in 
the  76-77  school  year.  The  board  members  considered  approximately 
eighty  factors  on  each  cadet  in  making  their  selections  which  were 
then  submitted  to  the  Policy  Capturing  process. 

The Air Force Human Resources Laboratory used regression techniques
to analyze the resulting data on each of the cadets and scores resulting
from the nine Policy Capture Board members. Eleven factors were found
to be significant predictors of the policy actually used by the Policy
Capture Board. These eleven were then submitted to a process known as
Hierarchical Grouping (Hier-Grp).


*Christal, Raymond E., Selecting a Harem - and Other Applications
of the Policy-Capturing Model, PRL-TR-67-1, March 1967.
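Policy capturing, as described above, amounts to regressing a board
member's ratings on the folder data to recover the weights implicitly
applied. The sketch below is a minimal illustration, not the
Laboratory's procedure: it uses ordinary least squares through the
normal equations, two hypothetical predictors, and ratings generated
from a known policy so that the recovered weights can be checked.
All data and function names are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def capture_policy(predictors, ratings):
    """Least-squares weights (intercept first) that best reproduce the ratings."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical folders: (AFOQT composite percentile, GPA on a 4.0 scale).
folders = [(30, 2.4), (45, 3.1), (60, 2.8), (75, 3.6), (50, 2.0), (85, 3.0)]
# Hypothetical ratings produced by a known policy: 1 + 0.05 * AFOQT + 2 * GPA.
ratings = [1 + 0.05 * afoqt + 2.0 * gpa for afoqt, gpa in folders]

weights = capture_policy(folders, ratings)  # [intercept, AFOQT weight, GPA weight]
```

With real board data the ratings would not be an exact linear
function of the predictors, and the fit would be judged by how much
of the rating variance the equation explains.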

Hierarchical Grouping (Hier-Grp) is designed to reduce a set of
regression equations (also called systems or criteria) computed from
proportional predictor sums of cross products matrices to a single
equation. In the process, a taxonomy of regression equations results,
based on the similarity between the systems. The iterative process
begins with a given number of separate regression equations and, at
each successive iteration, forces a compromise equation to be
substituted for two of the separate systems. At each iteration, a new
cluster of regression systems is formed or an existing cluster is
enlarged so that, in the final step, all equations form a single broad
cluster. The criterion for selecting the two equations is specified
at the start of the program and is used at every step of the program.
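The iterative merging described above can be sketched as follows.
This is a simplified illustration under stated assumptions, not the
Hier-Grp program itself: it uses a single predictor, refits a shared
equation on the pooled ratings of each candidate cluster rather than
working from sums of cross products matrices, and takes the smallest
increase in squared prediction error as the merging criterion. The
judges, scores, and function names are hypothetical.

```python
from itertools import combinations


def fit_line(xs, ys):
    """Least-squares (intercept, slope) for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def squared_error(line, xs, ys):
    intercept, slope = line
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def cluster_error(members, xs, ratings):
    """Total error when all judges in a cluster share one compromise equation."""
    pooled_x = [x for _ in members for x in xs]
    pooled_y = [y for j in members for y in ratings[j]]
    shared = fit_line(pooled_x, pooled_y)
    return sum(squared_error(shared, xs, ratings[j]) for j in members)


def hier_grp(xs, ratings):
    """Merge judges stepwise, each time choosing the cheapest pair of clusters."""
    clusters = [(j,) for j in range(len(ratings))]
    merges = []
    while len(clusters) > 1:
        def cost(pair):
            a, b = pair
            return (cluster_error(a + b, xs, ratings)
                    - cluster_error(a, xs, ratings)
                    - cluster_error(b, xs, ratings))
        a, b = min(combinations(clusters, 2), key=cost)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append(a + b)
    return merges


# Hypothetical data: one predictor and three judges rating the same five folders.
scores = [10, 20, 30, 40, 50]
judge_ratings = [
    [1 + 0.10 * s for s in scores],  # judge 0
    [1 + 0.10 * s for s in scores],  # judge 1, same policy as judge 0
    [6 - 0.05 * s for s in scores],  # judge 2, a different policy
]
merge_order = hier_grp(scores, judge_ratings)
```

The order of mergers is the taxonomy: judges with similar policies
join early at little cost, and the cost of the final mergers shows how
much is lost by forcing everyone onto a single equation.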

Through  the  process  of  Hier-Grp,  Human  Resources  Laboratory  was 
able  to  develop  an  equation  with  the  associated  weights  to  be  applied 
to  each  of  the  eleven  key  variables  which  when  solved  resulted  in  a 
Quality  Index  Score  (QIS)  for  each  applicant. 

Table 1 below shows the eleven variables with their respective
weights and a computation of a Quality Index Score based on a mean
value for each variable and its percent of the total score.

TABLE 1

FY 79 WPSS AND THE ELEVEN VARIABLES


                                    FY 79 WPSS
Variable                  Mean      Wt        Points    % Score

AFOQT-C                   45.5     0.1381      6.28     ( 8.4)
SAT                     1054.2     0.0245     25.83     (34.5)
GPA                      277.3     0.1005     27.87     (37.2)
DET-CDR                    3.2     1.7975      5.75     ( 7.7)
ASTIN** (Selectivity)      3.5     0.7172      2.51     ( 3.3)
ROTC-GPA                 224.2     0.0130      2.91     ( 3.9)
AFOQT-Q                   47.0     0.0459      2.16     ( 2.9)
PROGRAM                    0.6     1.5837      0.95     ( 1.3)
TECH-MAJ                   0.4     2.5949      1.04     ( 1.4)
NR RANKED                 36.6     0.0222      0.81     ( 1.1)
DET-RANK                  14.7    -0.0870     -1.28     (-1.7)

Quality Index Score = 74.83
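The Quality Index Score is a weighted sum: each variable's value is
multiplied by its weight to give points, and the points are added.
The sketch below recomputes the mean-applicant example from the
means and four-decimal weights. It takes the SAT mean as 1054.2, the
value printed in Tables 4 and 5, since that value reproduces the
25.83 points shown. The function name and the shortened variable
labels are illustrative.

```python
# (variable, mean value, weight) for the FY 79 WPSS.
WPSS = [
    ("AFOQT-C", 45.5, 0.1381),
    ("SAT", 1054.2, 0.0245),
    ("GPA", 277.3, 0.1005),
    ("DET-CDR", 3.2, 1.7975),
    ("ASTIN", 3.5, 0.7172),
    ("ROTC-GPA", 224.2, 0.0130),
    ("AFOQT-Q", 47.0, 0.0459),
    ("PROGRAM", 0.6, 1.5837),
    ("TECH-MAJ", 0.4, 2.5949),
    ("NR-RANKED", 36.6, 0.0222),
    ("DET-RANK", 14.7, -0.0870),
]


def quality_index_score(rows):
    """Return (QIS, points per variable, percent of QIS per variable)."""
    points = {name: value * weight for name, value, weight in rows}
    qis = sum(points.values())
    percent = {name: 100.0 * p / qis for name, p in points.items()}
    return qis, points, percent


qis, points, percent = quality_index_score(WPSS)
```

Summing unrounded products gives a total that differs from the
printed score only in the second decimal place, because the table
adds points that were already rounded. For an individual applicant
the same function would be called with that applicant's own values
in place of the means.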

Bottenberg, R. A. and Christal, R. E., An Iterative Technique
for Clustering Criteria Which Retains Optimum Predictive Efficiency,
WADD-TN-30, AD 261 615, Lackland AFB, Texas, Personnel Laboratory,
Wright Air Development Division, March 1961.

**Astin, Alexander W., Predicting Academic Performance in College,
The Free Press, New York, 1971.

These variables were used in making selections for fall 77
admissions to the advanced program (Professional Officer Course). All
selections were made or confirmed centrally at AFROTC, Maxwell AFB,
although the plan was to establish a Quality Index Score cut-off
point that would include about 80% of applicants, who could be selected
at the Detachment level. Records of the remaining 20% would be submitted
to Maxwell AFB for central selection. This would insure that the best
of the low-scoring group would be selected overall rather than just the
best at each detachment.
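The planned split between detachment-level and central selection can
be sketched as a cut-off taken from the ranked Quality Index Scores.
This is an illustration of the idea only: the ten applicant scores are
hypothetical, and the function names and the handling of the cut-off
boundary are assumptions.

```python
def cutoff_for_share(scores, local_share=0.80):
    """Lowest score that still falls inside the top `local_share` of applicants."""
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores, reverse=True)
    n_local = max(1, round(local_share * len(ranked)))
    return ranked[n_local - 1]


def route(scores, cutoff):
    """Split applicants into detachment-level and central-selection groups."""
    local = [s for s in scores if s >= cutoff]
    central = [s for s in scores if s < cutoff]
    return local, central


# Hypothetical Quality Index Scores for ten applicants.
applicant_qis = [58.2, 91.0, 74.8, 66.1, 80.3, 62.9, 70.4, 85.7, 77.5, 69.0]

cutoff = cutoff_for_share(applicant_qis)
local, central = route(applicant_qis, cutoff)
```

Pooling the records below the cut-off from every detachment lets the
central board compare low-scoring applicants with one another across
institutions.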


RESULTS 


The results of the new system to date are encouraging. Subjectively,
the Detachment Commanders perceive the system as a valuable tool for
assessing an applicant's potential for selection. Because of the
system's objectivity and face-validity, it has been readily accepted
by applicants for the program. A significant advantage of the system
is that for the first time selections are made on a quantified basis
that is standardized on a national level. As a result, the competition
is truly national in scope. The Air University Board of Visitors placed
its approval on the system in its March 77 report when it wrote: "The
quality of cadets admitted to the junior and senior year programs is
substantially enhanced by means of the complex instruments."

The mean scores of those selected and non-selected are portrayed
in Table 2 along with overall means, and means by race and sex. It
may be readily noted that:

a. Black applicants are encountering difficulty in the
competition, largely because of their relatively low standardized
test scores.

b. The selectees are highest in Quantitative aptitude, as
measured by the AFOQT. (This is probably because of the large proportion
of science and technology majors in the group.)


Air University Board of Visitors, Thirty-Third Meeting, 15 Mar
77. Minutes and Report of Chairman (Dr. Arthur G. Hansen).

TABLE 2

ALL AFROTC APPLICANTS

                    Caucasian          Black             Other             Total
                  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel

TOTAL APPLICANTS    3835     313     333     211     134      28    4302     552
TECH APPLICANTS     1729      46     103      28      61       6    1893      80
2-YR APPLICANTS     1414     191     131     109      57      11    1602     311
4-YR APPLICANTS     2421     122     202     102      77      17    2700     241
MEAN SAT (CONV)   1100.0   837.2   945.4   658.1  1052.0   777.5  1086.5   765.7
MEAN AFOQT-COMP    51.27   13.79   25.25    3.70   39.07    7.29   48.87    9.60
MEAN GPA          285.16  233.80  281.92  235.49  289.84  232.82  285.06  234.39
MEAN AFOQT-QUAN    53.11   21.17   33.27   10.60   46.65   13.54   51.38   16.74


FEMALE AFROTC APPLICANTS

                    Caucasian          Black             Other             Total
                  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel

TOTAL APPLICANTS     640      67      99      70      21       5     760     142
TECH APPLICANTS      150       2      17       7       4       1     171      10
2-YR APPLICANTS      352      56      53      39      11       3     416      98
4-YR APPLICANTS      268      11      46      31      10       2     344      44
MEAN SAT (CONV)   1059.9   660.4   887.1   626.7  1042.2   798.8  1036.9   743.0
MEAN AFOQT-COMP    38.90    9.43   14.55    3.07   32.19    7.40   35.54    6.23
MEAN GPA          302.02  238.33  301.88  242.63  307.48  247.40  302.15  240.77
MEAN AFOQT-QUAN    43.71   15.57   26.16   10.07   46.71   19.20   41.51   12.99

TABLE 2 (Cont'd)


MALE AFROTC APPLICANTS

                    Caucasian          Black             Other             Total
                  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel  Select  Non-Sel

TOTAL APPLICANTS    3195     246     234     141     113      23    3542     410
TECH APPLICANTS     1579      44      86      21      57       5    1722      70
2-YR APPLICANTS     1062     135      78      70      46       8    1186     213
4-YR APPLICANTS     2133     111     156      71      67      15    2356     197
MEAN SAT (CONV)   1108.0   830.8   970.0   673.7  1053.8   772.9  1097.2   773.6
MEAN AFOQT-COMP    53.75   14.97   29.78    4.01   40.35    7.26   51.74   10.77
MEAN GPA          281.79  232.56  273.47  231.94  286.56  229.65  281.39  232.19
MEAN AFOQT-QUAN    55.00   22.69   36.28   10.86   46.64   12.30   53.49   18.04

There is some evidence of problems with the system as indicated
by the graph in Figure 1. The frequency distribution of Quality Index
Scores falls into the classic bell curve, rising and falling in almost
perfect stair-case increments until it arrives at QIS 63, which was
the cut-off point. There is a drastic rise in frequency just above
63 and an equally drastic drop just below it. The reasons for this
asymmetry are not known; however, the cause is under continued
analysis.


Figure 1

FREQUENCY OF QUALITY INDEX SCORE

The rise in overall quality as a result of the first application
of the new system is clearly demonstrated in Figure 2. The rise in the
Mean Officer Quality Score from 41.5 in Academic Year 76-77 to 48.8 in
Academic Year 77-78 (7.3 points) was smaller than anticipated but does
not tell the entire story. More significant is the redistribution of
frequencies within the group, resulting in a drop among those with scores
below 25 from 35.8% to 21.7%. Moreover, it is now certain that those
with low scores remaining have other mitigating traits.

Figure 2

UPPER AND LOWER OQC QUARTILE
% DISTRIBUTION + OVERALL MEANS

AY 76-77 (AS 300) compared with OQC of WPSS selectees, AY 77-78.
Recoverable labels: AY 76-77 mean = 41.5; AY 77-78 mean = 48.8;
35.8%; 19.8%.

Table 3 looks at the distribution of Cumulative Grade Point Averages
for the WPSS selectees as compared to the AY 76/77 AS 300 group. It
reveals that the proportion of higher GPAs has increased while the
proportion of lower GPAs has decreased. This is another indication
that the WPSS is enhancing the quality of cadets who enter the POC this
fall, the AS 300 class of AY 77/78.

TABLE 3

COMPARISON OF GROUPED GPA DATA ON 76-77 ENROLLEES VS WPSS SELECTEES

                    76/77 AS 300        WPSS Selectees
GPA                   f       %           f       %

3.75 - 4.00          107     3.5         205     4.9
3.50 - 3.74          168     5.4         341     8.2
3.25 - 3.49          283     9.1         445    10.7
3.00 - 3.24          436    14.1         669    16.1
2.75 - 2.99          492    15.9         679    16.4
2.50 - 2.74          563    18.2         680    16.4
2.25 - 2.49          510    16.5         576    13.9
2.00 - 2.24          420    13.6         430    10.4
1.75 - 1.99           92     3.0          97     2.3
1.50 - 1.74           17      .5          23      .5
       1.49            6      .2           5      .1

                    3094   100          4150    99.9

In July 1977, a new selection board was conducted to test and
validate the findings of the first board as applied through the system.
The same care was taken to insure that board members were representative
of all aspects of the AFROTC Program. Selections for the fall class had,
for the most part, already been made, but records of a representative
random sample of all applicants were furnished the board as if the
selections were yet to be made. The board members were not given the
Quality Index Scores but were given all the data from which the scores
were computed. They were also given copies of the applicant's college
transcripts, and PAS's were allowed to provide a letter in the applicant's
behalf if they so desired. The findings of the board were again submitted
to regression and Hierarchical Grouping (Hier-Grp) procedures.


5. Nie, Norman H.; Hull, C. Hadlai; Jenkins, Jean G.; Steinbrenner,
Karin; Bent, Dale H., Statistical Package for the Social Sciences,
2nd Edition, McGraw-Hill Book Co., 1975.

The new weights and the influence they exerted on the total score in
comparison with the old ones are listed in Table 4:


TABLE 4


FY 79 SYSTEM COMPARED WITH 77 REVALIDATION RESULTS


                         ------- FY 79 WPSS -------   -------- 77 BOARD --------
Variable        Mean      Wt      Points   % Score     Wt      Points   % Score

AFOQT-C         45.5     0.1381    6.28    ( 8.4)     0.1698    7.73    ( 9.3)
SAT           1054.2     0.0245   25.83    (34.5)     0.0249   26.25    (31.5)
GPA            277.3     0.1005   27.87    (37.2)     0.0894   24.79    (29.8)
DET-CDR          3.2     1.7975    5.75    ( 7.7)     6.2485   20.00    (24.0)
ASTIN            3.5     0.7172    2.51    ( 3.3)     0.7614    2.66    ( 3.2)
ROTC-GPA       224.2     0.0130    2.91    ( 3.9)     0.0046    1.03    ( 1.2)
AFOQT-Q         47.0     0.0459    2.16    ( 2.9)     0.0378    1.78    ( 2.1)
PROGRAM          0.6     1.5837    0.95    ( 1.3)    -0.2926   -0.18    (-0.2)
TECH-MAJ         0.4     2.5949    1.04    ( 1.4)     2.9426    1.18    ( 1.4)
NR RANKED       36.6     0.0222    0.81    ( 1.1)     0.1136    4.16    ( 5.0)
DET-RANK        14.7    -0.0870   -1.28    (-1.7)    -0.4152   -6.10    (-7.3)

The new weights presented some anomalies and raised some serious
problems. First, the drastic change in the influence of the Detachment
Commander's rating from 7.7% to 24.0% seemed to amount to a complete
shift of policy from the first board to the second. Moreover, the
new weight afforded the Detachment Commander's rating was largely
drawn from the weights previously assigned to the SAT and GPA, measures
more nearly valid across institutional lines.

Second, the slightly negative weight assigned the "Program"
variable seemed illogical even though negligible. AFROTC can hardly
afford to penalize enrollees in its four-year program even by 18/100ths
of a point.

For these reasons, AFROTC decided to adhere to the weights derived
from the first board and to ignore the findings of the second. The only
change will be to round the weights to two decimal places instead of the
present four and to compute future Quality Index Scores to only two
decimal places. This will considerably simplify the computation and will
cause only a minimal change among those selected. The modified system
to be implemented 1 November 1977 is displayed in Table 5.

TABLE 5


FY 79 FOUR AND TWO DIGIT COMPARED


                         -------- 4 DIGITS --------   --------- 2 DIGIT --------
Variable        Mean      Wt      Points   % Score     Wt      Points   % Score

AFOQT-C         45.5     0.1381    6.28    ( 8.4)     0.14      6.37    ( 9.2)
SAT           1054.2     0.0245   25.83    (34.5)     0.02     21.08    (30.4)
GPA            277.3     0.1005   27.87    (37.2)     0.10     27.73    (39.9)
DET-CDR          3.2     1.7975    5.75    ( 7.7)     1.80      5.76    ( 8.3)
ASTIN            3.5     0.7172    2.51    ( 3.3)     0.72      2.52    ( 3.6)
ROTC-GPA       224.2     0.0130    2.91    ( 3.9)     0.01      2.24    ( 3.2)
AFOQT-Q         47.0     0.0459    2.16    ( 2.9)     0.05      2.35    ( 3.4)
PROGRAM          0.6     1.5837    0.95    ( 1.3)     1.58      0.95    ( 1.4)
TECH-MAJ         0.4     2.5949    1.04    ( 1.4)     2.59      1.04    ( 1.5)
NR RANKED       36.6     0.0222    0.81    ( 1.1)     0.02      0.73    ( 1.0)
DET-RANK        14.7    -0.0870   -1.28    (-1.7)    -0.09     -1.32    (-1.9)

                         QIS = 74.83                  QIS = 69.45
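The effect of rounding the weights can be reproduced from the means
and four-decimal weights in Table 5. The sketch below is a check on
that arithmetic, assuming conventional half-up rounding to two
places; decimal arithmetic is used so the rounding is exact. It also
isolates the contribution of the SAT weight, whose rounding from
0.0245 to 0.02 accounts for most of the difference between the two
Quality Index Scores. Names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# (variable, mean value, four-decimal weight), kept as strings for exact decimals.
WPSS = [
    ("AFOQT-C", "45.5", "0.1381"), ("SAT", "1054.2", "0.0245"),
    ("GPA", "277.3", "0.1005"), ("DET-CDR", "3.2", "1.7975"),
    ("ASTIN", "3.5", "0.7172"), ("ROTC-GPA", "224.2", "0.0130"),
    ("AFOQT-Q", "47.0", "0.0459"), ("PROGRAM", "0.6", "1.5837"),
    ("TECH-MAJ", "0.4", "2.5949"), ("NR-RANKED", "36.6", "0.0222"),
    ("DET-RANK", "14.7", "-0.0870"),
]

TWO_PLACES = Decimal("0.01")


def round_weight(weight):
    """Round a weight to two decimal places, halves rounding away from zero."""
    return Decimal(weight).quantize(TWO_PLACES, rounding=ROUND_HALF_UP)


def qis(rows, rounded=False):
    """Quality Index Score for the mean applicant, with full or rounded weights."""
    total = Decimal("0")
    for _, value, weight in rows:
        w = round_weight(weight) if rounded else Decimal(weight)
        total += Decimal(value) * w
    return total


four_digit_qis = qis(WPSS)
two_digit_qis = qis(WPSS, rounded=True)

# Points lost on the SAT alone when its weight is rounded from 0.0245 to 0.02.
sat_drop = Decimal("1054.2") * (Decimal("0.0245") - round_weight("0.0245"))
```

Because the change applies the same weights to every applicant, a
lower score for everyone need not alter the rank order much; how many
individual selections change would have to be checked against actual
applicant records.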

AFROTC will continue to monitor and adjust the system as necessary.
The preliminary indications are encouraging; however, the real value of
the system will not be known for several years, when attrition rates
for UFT, missile school and other tech schools can be compared with
accessions of new officers entering active duty under these criteria.
AFROTC is sure, however, that this system will produce a better Air
Force officer. The Weighted Professional Officer Course Selection
System represents an equitable and workable method of selecting future
Air Force officers. The results to date indicate that improvements in
quality have been made.

Military Testing Association Conference
17-21 October 1977

Fitness Assessment for Entry into the Service -
A Two Edged Sword?

Dennis M. Kowal, CPT, MSC, Ph.D. and James A. Vogel, Ph.D.
US Army Research Institute of Environmental Medicine
Natick, Massachusetts 01760


The Armed Forces are considering supplementing the present entry
medical examination with an evaluation of physical fitness (stamina and
muscle strength). Thus, new enlistees would be required to meet a
minimum standard for entry into the service plus an appropriate
standard for particular job specialty assignments. This proposal is based
on the growing concern for the high attrition during initial training
and the inability of some enlistees to physically perform their military
job specialties after assignment to their units. On the other hand,
there is also a concern that additional screening at enlistment will
further aggravate the anticipated shortfall in Armed Forces enlistment
requirements. In response to the above requirement, the US Army Research
Institute of Environmental Medicine has developed methodology for the
assessment of physical fitness (work capacity) that would be suitable for
use in Armed Forces Entrance Examination Stations. We now plan to
evaluate the usefulness of this fitness test battery in a pilot study which
will determine its ability to predict physical performance in the service.
Our ultimate objective is to improve our personnel selection and
classification procedures for entry into the services without sexual
discrimination.

Our proposed physical fitness test battery includes the following
components: a) stamina (cardiopulmonary endurance or aerobic fitness);
b) muscle strength of upper torso, trunk and legs; c) muscle strength
endurance of upper torso and trunk; and d) coordination.

The selection of these measures was based on: a) their reliability
of measurement; b) face validity as components involved in task
performance; c) simplicity and ease of measurement; d) minimal equipment;
and e) suitability for predicting actual physical performance in the
job situation.

Stamina or aerobic fitness is measured with the use of a three load
stepping test to predict maximal oxygen uptake from the exercise heart
rate response (1,2). The test consists of stepping at a rate of 30
steps per minute at 3 of 4 possible step heights (depending on height
of subject): 4, 8, 12 and 16 inches. Stepping continues for 3 minutes
at each step and proceeds immediately to the next step without rest.
Heart rate is recorded electrocardiographically with paper stick-on
electrodes and a cardiotachometer with a digital meter display. The
heart rates at the end of each load are applied to a nomogram to
predict maximal oxygen uptake based on the near linear relationship between
heart rate and oxygen uptake and the age-maximal heart rate relation (3).
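
The logic behind such a nomogram can be illustrated numerically. The sketch below is not the nomogram of reference (3); it simply fits a straight line to heart rate and oxygen uptake at the submaximal loads and extrapolates it to an age-predicted maximal heart rate. The function name `predict_vo2max`, the example load values, and the rule of thumb "220 minus age" for maximal heart rate are all assumptions introduced for illustration.

```python
def predict_vo2max(loads, age_years):
    """Extrapolate a least-squares line of oxygen uptake on heart rate
    to an age-predicted maximal heart rate.

    loads: list of (heart_rate_bpm, vo2_litres_per_min), one per step load.
    """
    n = len(loads)
    mean_hr = sum(hr for hr, _ in loads) / n
    mean_vo2 = sum(vo2 for _, vo2 in loads) / n

    # Ordinary least-squares slope and intercept of vo2 on heart rate.
    sxx = sum((hr - mean_hr) ** 2 for hr, _ in loads)
    sxy = sum((hr - mean_hr) * (vo2 - mean_vo2) for hr, vo2 in loads)
    slope = sxy / sxx
    intercept = mean_vo2 - slope * mean_hr

    # ASSUMPTION: a common rule of thumb, not the relation used in (3).
    hr_max = 220 - age_years
    return intercept + slope * hr_max


# Hypothetical end-of-load readings for three step heights.
example_loads = [(110, 1.0), (130, 1.5), (150, 2.0)]
estimate = predict_vo2max(example_loads, age_years=20)
```

With these made-up readings the fitted line rises 0.025 litres per minute for each beat per minute, so extending it from 150 to 200 beats per minute adds 1.25 litres per minute to the last observed uptake.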

A device has been designed and built after the design of Asmussen
(4) and Hermansen (5) which is used to make isometric strength measures
of all three muscle groups previously mentioned. The device employs
cable tensiometers, spring-tension devices that indicate the kilograms
of force placed on them by a cable attached between the device and the
muscle group of the subject. For the upper torso (shoulder-arm) muscle
group, the subject is placed in a sitting position secured with a lap
belt, and grasps an overhead bar so that his elbow makes a 90 degree bend
and the arm is parallel to the floor. The bar is connected to the cable
tensiometer to record exerted force. For the leg muscle group, the subject
remains in the sitting position, with his legs flexed at 90 degrees at the
knee and his feet pushing against a bar, also connected to a tensiometer.
For the trunk extensor muscle group, the subject stands facing an
upright bar containing a brace plate and a shoulder strap connected to a
tensiometer. The subject bends back against the shoulder strap. All
three muscle group strengths are recorded by having the subject exert
maximally (isometrically) for 3 seconds. Three trials are performed.

Muscle strength endurance will be measured with bent-leg sit-ups.
The maximal number of sit-ups that can be performed in a 60 second
period will be recorded. Upper torso strength endurance (shoulders and
arms) will be tested with the flex-arm hang. In this test the maximum
time is recorded that the subject can hang suspended from a bar, with
the chin above the bar and the hands grasping the bar.

Whole body or arm-leg coordination will be evaluated with a ladder
climb test. The maximum number of rungs traversed on a 10 foot vertical
ladder in 20 seconds is recorded.

Additional data to be collected along with the battery are:
a) body size and body fat estimation; b) physical activity history; and
c) a self report on responses to demanding situations.

A pilot study to evaluate this test battery will be conducted from
January to June 1978 at the Training Center at Fort Jackson, SC. The
purpose of the study is to: a) evaluate the ability of the proposed
physical fitness test battery to predict subsequent physical performance
during basic (BT) and advanced individual training (AIT); and b) determine
which components can be used for MOS assignment purposes (job profiling).
Each component of the battery will be evaluated for its predictive
validity for the criterion referenced performances. The first criterion
performance will be the new Army Physical Fitness Test which is
administered during BT and AIT. This is a 3 event test: 1 mile run,
sit-ups, push-ups. The second criterion performance measure consists of
the Army basic common soldiering tasks: 5 mile road march, dig
emplacement, 50 pound-50 meter lift and carry, hand grenade throw,
75 meter crawl, 75 meter interrupted rush. Other data to be collected
will include sick call incidence, recycling and drop-out information and
military performance measures.

The study design includes the initial evaluation of 1000 new basic
trainees on the fitness test battery during the fill week of basic
training. Criterion performance measure data will be collected as
these individuals are followed through basic and advanced individual
training. The Army fitness test is performed at the middle and end of
BT and AIT. The common soldiering tasks will be performed at the end
of BT and AIT. Upon completion of the data collection, a multiple
regression analysis will be performed to identify the factors
(items on the test battery) that best differentiate between successful
and unsuccessful individuals in BT and AIT.


ABSTRACT

This paper acknowledges that there are diverse sources of error
which must be controlled in order for aptitude tests to have
substantial validity, discusses many of the sources, and describes a highly
cost-effective procedure for immediate verification of the veridicality
of operational test scores.


INTRODUCTION 

There are both lasting and temporary, general and specific,
characteristics that cause the aptitude test score an individual attains to
vary from his theoretical true score. For purposes of prediction in
selection and classification through the use of testing, all reasons that
would increase this variance over a group may be considered error.

Such semi-permanent influences as the ability to deal with
instructions on tests, or general examinee strategies for answering test
questions, vary widely with individuals. The services have used several
means to reduce error attributable to this "test wiseness".
Instructions are easy to understand and are targeted to low levels of
reading ability, and sample test items and sample instructions are
provided in an information pamphlet intended to familiarize everyone
concerned with the nature of the test.

Temporary influences on test scores may also affect measurement. A
person's physical and emotional condition, and the physical testing
environment, may cause variation from true scores. To reduce these
temporary effects that add to measurement error, care is taken to excuse
from the testing session persons who are clearly ill or excessively
fatigued, or persons who are disturbing others; and there are
regulations that prohibit testing long hours without breaks or testing in
places without proper conditions of lighting and temperature.

Scoring and recording errors occur either as transitory human
errors or, at times, as semi-permanent conditions when, for example,
an undetected malfunction develops in equipment used to score tests.
Generally the variety of scoring aids now used in Armed Forces
Examining and Entrance Stations (AFEES), including optical scanning,
provides not only error reduction, but time savings as well.

Another source of measurement error is test compromise. These
measurement errors, rather than being randomly distributed, usually
operate in one direction, to yield overestimates of qualifications.
Although compromise probably does not affect the measurement of as
large a number of enlistees as other measurement errors could, its
nonrandom character makes test security of great importance.


In the past, the most common means of coping with test compromise
has been the use of alternate test forms. There are two different types
of alternate forms, and they differ in cost of production and in the
kind of protection they provide. One type uses the same items, but
arranged in different sequences in different test booklets. This type
remedies situations in which the compromise has taken the form of
examinees being provided with a key to the correct answers, but not the
content of those answers (for example: 1a, 2c, 3d, etc.). This type of
compromise is believed to be relatively uncommon. The other type of
alternate test form is very much more costly to produce but also very
much more comprehensive in its protection. It consists of two tests with
similar (but not identical) content, matched in difficulty and other
statistical properties. The protection afforded is not just for cases
having the key, but for cases having the full answers to one of the forms.
Both types of alternate test form are now used in the test
quality control programs of the services.

The parallel forms approach provides reasonable protection, but at
an extremely high cost of production. That approach also does not, in
and of itself, identify cases of suspect scores.

The present paper describes an alternative approach to test quality
control, one which involves minimum test development costs as well as
minimum examining time on site.

APPROACH 


The objective of this development was to provide an operational
tool to detect a substantial percentage of enlistment qualification test
compromise cases. The general strategy was to capitalize on what is
known or can be deduced logically concerning the differential
compromise vulnerability of the various parts of the battery (ASVAB), and to
combine that information with known statistical relationships among the
subtests so as to "flag" highly unusual score patterns for subsequent
followup.

Operational experience has shown that the main target for
compromise has been the AFQT portion of the test battery. AFQT has been
in joint services use the longest, for some of the services AFQT is
the principal selection standard, and its contents (vocabulary,
arithmetic problems, and geometric figures) are generally the best
known of all military tests.

Within the AFQT portion of the battery, experience has indicated
that, if compromise takes place, it involves the vocabulary items far
more frequently than any others. This is not surprising
inasmuch as vocabulary words are easy to remember and to look
up after the examination. The other two subtests do not lend themselves
to this kind of compromise; the arithmetic problems are relatively long
prose paragraphs and there is no readily available source of the right
answers as there is with a dictionary, and the totally pictorial test of
spatial relations is nearly impossible to compromise through memory, and
again, there is no "dictionary" available.

Given (1) that Word Knowledge may be the key ASVAB subtest
compromised, (2) that the other components are relatively hard to compromise,
and (3) that the psychometric relationships among these subtests are
stable and known: LIKELY COMPROMISE CAN BE DETECTED BY COMPARING
DISCREPANCIES IN SCORE BETWEEN THE WORD KNOWLEDGE SUBTEST AND ONE OR BOTH
OF THE OTHER AFQT COMPONENTS (Arithmetic Reasoning, Space Perception).


IMPLEMENTATION 

The numeric values needed to begin to implement the logic of this
approach were derived from a national sample of 1,000 AFEES applicants
drawn in January 1976. These 1,000 cases were stratified on AFQT to
conform to the standard mobilization reference population, and the
statistics shown in Table 1 were obtained. As may be seen, the correlation
of Word Knowledge (WK) with Space Perception (SP) is 0.43. This means
that fairly sizeable score discrepancies between WK and SP can be
expected just by chance. On the other hand, the correlation of WK with
Arithmetic Reasoning (AR) is high enough to be usable, 0.68. As an
aside, the correlation of WK with the sum of AR and SP is no higher than
the WK/AR correlation and, if used operationally, obtaining the AR/SP
sum would involve an extra hand computation at operational testing sites.
As a result, development focused on use of the WK/AR discrepancy as
being technically as sound, and operationally more feasible.

The intention was to develop a procedure which would "flag", as
suspicious, cases in which the WK score exceeded the AR score by more
than an expected number of points. The regression line of WK on AR
predicts the expected WK score from any AR score. The prediction
has tolerance bounds defined by the standard error of estimate of WK on
AR and the level of significance (alpha) selected. A one-tailed
alpha level of p = 0.16 (i.e., one sigma) was chosen in consideration
of maximizing detectability for subsequent followup. This resulted in
identifying 11 score points as the size of the WK/AR discrepancy which
would "flag" unusual cases. That is, knowing that WK and AR correlate
0.68, in a fair test only a small percentage of individuals (fewer
than 16%) would be expected to exhibit a WK score 11 or more points
larger than their AR score. Those who do must be considered unusual.
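
The ingredients of such a tolerance bound can be sketched with the textbook formulas. The code below is an illustration only: Table 1 is not reproduced here, so the standard deviation used in the example is hypothetical, and the sketch does not attempt to reproduce the paper's 11-point figure. The function names are likewise invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def standard_error_of_estimate(sd_y, r):
    """Standard error of estimate when predicting y from x: sd_y * sqrt(1 - r^2)."""
    return sd_y * sqrt(1.0 - r * r)


def one_tailed_z(alpha):
    """Normal deviate that cuts off the upper `alpha` tail."""
    return NormalDist().inv_cdf(1.0 - alpha)


def upper_tolerance_bound(predicted_wk, sd_wk, r, alpha):
    """Highest WK score treated as 'expected' for a given predicted WK."""
    return predicted_wk + one_tailed_z(alpha) * standard_error_of_estimate(sd_wk, r)


# r = 0.68 is the WK/AR correlation quoted in the text;
# sd_wk = 10.0 is a HYPOTHETICAL value, not taken from Table 1.
see = standard_error_of_estimate(sd_y=10.0, r=0.68)
z = one_tailed_z(0.16)
```

A one-tailed alpha of 0.16 corresponds to a deviate just under 1.0, which is why the text describes the criterion as "one sigma".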

A group exhibiting the unusual score pattern consists of two types
of individuals: (1) those for whom the abilities measured by the WK
subtest are truly well in excess of their abilities in the domains
measured by AR, and (2) those whose WK scores are artificially inflated
through some breach of test security. The next step, then, is to sort
these types apart.

The simplest way to sort the compromise cases from the genuine,
though unusual, ones is to administer a 10-minute retest consisting of
known secure WK items, and to compare performance on the WK retest with
performance on the original WK. For some of these, the original WK score
will replicate, plus or minus a calculable chance effect; for some, the
second WK score will be so much lower as to be virtually unexplainable
through normal chance variation.

Just as the initial screen utilized the values shown in Table 1 to
define the critical WK/AR difference, values in the same Table plus those
in Table 2 were used to set the chance limits for the WK 1/WK 2
difference. For this step a relatively low alpha level (p = 0.01) was set to
minimize the risk of false accusation and to identify cases virtually
unexplainable by the hypothesis of chance variation. When raw scores are
converted to percentages so as to control for the different test lengths,
the critical (one-sided) difference of WK 1 minus WK 2 was determined to
be 26 points, and individuals exhibiting a larger difference are
identified as most likely having received improper pretest assistance.

EMPIRICAL  TEST 

In the Spring of 1976 a sample of 111 enlistees who had been tested
with ASVAB-6 at AFEES was retested at the Ft. Jackson, SC, Reception
Station with ACB-73. ACB-73 contains WK and AR subtests, and was the Army's
basis for computing AFQT scores until replaced by ASVAB-6 and -7 in
January 1976. At the time the test sample was drawn, ACB-73 was no
longer operational and, hence, its WK subtest could be looked upon as
completely secure.

The first step in the test was to calculate the one-sided difference
of ASVAB-6 WK minus AR, and to refer it to the specified critical
difference of 11 points. 1/

The second step was to calculate the one-sided difference of
ASVAB-6 WK minus ACB-73 WK, and refer that difference to the specified
critical difference of 26 percentage points. This procedure
identified 11 of the 20, or 55%, as highly suspect compromise cases. These
and other important relationships are summarized in Table 3. As may be
seen, if 100% of the sample had been retested on the secure WK, 23 cases
would have been identified as highly suspect. Retest of only 18% of this
sample identified 11 of the 23, or 48%; that is, retest of fewer than
20% of the sample "caught" almost half of the compromise cases.

A final empirical test was performed to assure maximum certainty
of the percentage of the input which would have to be retested under
the rule of WK - AR >= 10 points. It may be recalled that 10 points
implements an alpha level of a little larger than 0.16 (i.e., a little
over 16% of the population "flagged" for retesting) and one sample, at
Ft. Jackson, yielded 18% so "flagged." In mid-1976, another sample of
AFEES data was drawn, of size 500, and the WK minus AR criterion was
again applied. Results in this sample "flagged" 17% of the cases.

SUMMARY  AND  CONCLUSIONS 

In recognition of the fact that the Word Knowledge subtest is the
most vulnerable to compromise of all the tests in the selection and
classification battery, a simplified procedure was developed to detect
WK compromise. The procedure has two steps:

1) At the time of scoring the AFQT portion of the battery,
separate those papers in which the AR raw score is less than 15, and
the WK raw score is 10 or more points greater than that AR score.

This step will "flag", as potentially suspect, some 15% to 20%.

2) To only those "flagged" by step one, administer a 10-minute
retest consisting of a completely secure WK, convert raw scores to
percentage correct if necessary, and separate those papers in which
the WK retest score is at least 26 percentage points lower than the
original WK score (checklist tables can easily be prepared to
accomplish all conversions and all comparisons with critical differences).

This combination of steps will identify, as highly suspect,
approximately half of all cases of likely test compromise.
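
The two steps can be written out as a pair of simple checks. This is a sketch of the rule as stated in the text, using the thresholds given there (AR under 15, WK at least 10 points above AR, and a drop of at least 26 percentage points on retest). The function names and the test lengths in the usage lines are hypothetical.

```python
def flag_for_retest(wk_raw, ar_raw, ar_ceiling=15, min_gap=10):
    """Step 1: flag papers with AR below the ceiling and WK well above AR."""
    return ar_raw < ar_ceiling and (wk_raw - ar_raw) >= min_gap


def percent_correct(raw_score, n_items):
    """Convert a raw score to percentage correct to allow for test length."""
    return 100.0 * raw_score / n_items


def highly_suspect(wk1_raw, wk1_items, wk2_raw, wk2_items, min_drop=26.0):
    """Step 2: the secure retest is at least `min_drop` points lower."""
    drop = percent_correct(wk1_raw, wk1_items) - percent_correct(wk2_raw, wk2_items)
    return drop >= min_drop


# Hypothetical examinee: original WK 27 of 30 items, AR 12, retest 10 of 20.
needs_retest = flag_for_retest(wk_raw=27, ar_raw=12)
suspect = needs_retest and highly_suspect(27, 30, 10, 20)
```

Only examinees for whom the first check is true would sit the retest, which is what keeps the retest load to the 15% to 20% quoted above.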

1/ Actually two modifications were made. First, for obvious
administrative simplicity, the 11 point critical difference was changed to
10. Second, cognizance also had to be taken of the difference in test
length between WK and AR. Thus, the full statement of the rule became
that the AR score be under 15 and the difference be 10 points.


TABLE 3

RESULTS OF EMPIRICAL TEST


QUALITY CONTROL STEPS

o WHEN SCORING AFQT PARTS, INSPECT FOR
  AR < 15 AND WK - AR >= 10
  THIS "FLAGS" 15 - 20% OF INPUT
o TO THOSE "FLAGGED", ADMINISTER
  10-MINUTE SECURE WK RETEST
o INSPECT FOR
  >= 26 PERCENTAGE POINT DROP WK 1 - WK 2
o THIS COMBINATION OF STEPS IDENTIFIES APPROXIMATELY
  HALF OF ALL LIKELY COMPROMISE


An alternative to the two-step procedure is to administer the WK
retest to everyone and apply the rule of a 26 percentage point drop.
This will detect twice the number of compromise cases, but at five to
seven times the cost (that is, retesting 100% of AFEES applicants
instead of between 15 and 20%).
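
The trade-off can be checked against the Ft. Jackson figures reported earlier: 111 enlistees, 20 flagged for retest, 11 of those identified as highly suspect, and 23 that a full retest would have identified. The short sketch below only restates that arithmetic; the function and key names are illustrative.

```python
def yield_summary(n_total, n_flagged, caught_two_step, caught_full_retest):
    """Compare the two-step screen with retesting every examinee."""
    return {
        # Share of examinees retested under the two-step procedure.
        "retest_fraction": n_flagged / n_total,
        # Share of all detectable cases that the two-step procedure finds.
        "detection_fraction": caught_two_step / caught_full_retest,
        # How many times more retests a full retest requires.
        "cost_ratio": n_total / n_flagged,
        # How many times more cases a full retest detects.
        "detection_ratio": caught_full_retest / caught_two_step,
    }


summary = yield_summary(n_total=111, n_flagged=20,
                        caught_two_step=11, caught_full_retest=23)
```

For this sample a full retest costs about 5.5 times as many retests and finds about 2.1 times as many cases, in line with the "twice the number" and "five to seven times the cost" quoted above.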

Another alternative is to enlarge the requisite WK/AR difference
so as to retest 10 percent of the input. In our Ft. Jackson sample,
this detected 30 percent of the likely compromise cases.

For any of these alternatives, the conclusion may be drawn that a
simple and cost-beneficial procedure for enhancing quality control in
the testing of military applicants has been developed.


CANADIAN MILITARY COLLEGE ATTRITION AND SUCCESS

Introduction


Recruitment and selection by an armed force, combined with
self-selection by the volunteering candidates, tend to provide applicants
to the military officer profession who have attitudes congenial to the
military establishment. Nonetheless, candidates disenrol and fail from
military colleges in sufficient numbers to be viewed with concern by
military authorities. Such drop-outs represent failures in the
transition to officer status via the academies. The problem has been
present throughout the existence of military colleges; recently, however,
the attrition rate has been alarming. Radway (1971) reported that 33%
of an entering class at West Point fail to graduate. USAFA reported a
25% attrition rate (Radway, 1971) and Annapolis a 33% attrition rate
(Abrahams and Neumann, 1973). Formerly, the major reasons were
academic, but recently they have been motivational. Attrition is greatest
in the first year.

In Canada, the situation is worse. Classes entering the
Canadian Military College system in the mid-sixties (1964-1967)
approximated 380 cadets per year, of whom approximately 230, or 60%,
voluntarily disenrolled, failed or were released prior to graduation
(Officer Career Development Plan, 1972). In the 1972-74 period, the
attrition rate for the Royal Military College was about 50%. As in
the American academies, the largest number of failures and resignations
occurred in the first year of sponsorship and were mainly motivational
disenrollees.

Most of the cadets who leave the Canadian Military Colleges
before being commissioned can be classified as voluntary disenrollees,
academic failures or military failures. Conversely, among the
approximately 50% of officer cadets who do graduate and are commissioned,
varying degrees of military excellence are observed and assessed during
the training in the military colleges.

Research was undertaken to examine the usefulness of the Strong
Vocational Interest Blank (SVIB) both for reducing the attrition of
military college cadets and for predicting military excellence of those
cadets retained in the system.

Retention in the Military Colleges

Research on retention in the military colleges is only a small
portion of the total research on retention in the military. Two
recent, comprehensive reviews of the literature applicable to retention
in the military were published by Culclasure (1971) and Tuttle and
Hazel (1974). Culclasure referred to extensive research conducted by
naval personnel researchers who evaluated the SVIB to determine its
efficacy in resolving retention problems.

Abrahams, Neumann and Githens (1968a, 1968b) evaluated the
success of the SVIB for selecting naval officer cadet applicants most
likely to remain on active duty beyond the minimum obligatory period.
Their analysis yielded an empirical retention scale that had relatively
high validity, high reliability, and low fakability. Neumann and
Abrahams (1971) and Abrahams and Neumann (1971) also evaluated the SVIB
as a predictor of career motivation, motivational disenrolment and
academic failure of naval officer cadets undergoing college educations
in civilian universities. Scales for each category of disengagement
were derived that were acceptable for operational use.

The major sources of research on retention in a military college
are the three major American military academies. Additional research
has been completed in the Canadian military colleges. Throughout,
approaches and techniques vary, reflecting the independence of operations
among these institutions.

Abrahams, Neumann and Denn (1969) devised a scoring key from the
SVIB to differentiate motivational disenrollees and remaining U.S. Naval
Academy midshipmen following the initial, non-academic, military summer
training program at the academy. Item analysis revealed clusters of
items, including sports, autonomy, leadership and aesthetic interests,
among others, that differentiated the two groups. An empirical scale
was devised to identify midshipmen most likely to voluntarily disenrol.
Abrahams and Neumann (1973) subsequently validated the SVIB for
predicting not only disenrolment but also military aptitude, and three
separate interest scales were empirically developed to predict
motivational disenrolment, academic disenrolment and military aptitude
of naval academy applicants. The results of this research clearly
supported the conclusions that the SVIB was a valid predictor of
midshipmen success, that scores on a derived scale were significantly
related to midshipmen military aptitude ratings, and that the SVIB
scales significantly aided in identifying those candidates most likely
to disenrol. Additional research conducted with the SVIB was completed
by Abrahams, Neumann and Githens (1970), Neumann and Abrahams (1974),
Wolfe (1971), Sands (1975), and Sands and McCullah (1974).

West Point researchers have investigated various facets of the
cadet success and failure enigma with varied results: admissions
information (Longo, 1966); cadet interests and needs (Fishburne, 1967);
high school faculty ratings (McLaughlin, 1971a, 1971b); personality
differences (does and Cortex, 1971); vocational interests (Marron,
1971a); military attitudes (Marron, 1971b); admission criteria (Butler,
1973, and Butler and Boustor, 1974); and adjustment reactions (U'Ren,
1974). USAFA has a similar record of diversity in its research, e.g.,
adjustment problems (Lachar, 1974). In 1974, the Air Force academy
administered the SVIB to academy applicants, after which the naval
academy SVIB disenrolment scale (from Abrahams and Neumann, 1973) was
scored. The reported results (from personal communication) were highly
statistically significant in predicting voluntary resignations.

Published research on cadet attrition in the Canadian military
colleges has originated almost totally from Royal Military College:
high school academic marks (Carpenter, 1969); personality factors (Bain,
1972, 1973; Carpenter, 1973, 1974); biographical questionnaire (Shields,
1973). The conclusions were that the prediction of either distinction
or attrition was an elusive undertaking.

This overview of the literature on military college cadet
retention, excellence, and disenrolment indicates that the area has been
investigated, but without great success. The single psychometric
instrument apparently most successfully employed in specific military
environments is the SVIB. Scales derived from its item bank have
predicted retention in the military college.


Method 


The  two  aims  of  this  study  were: 

(1)  To  develop  empirical  scales  from  the  items  of  the 
SVIB  to  determine  their  potential  for  predicting 
vocational  stability  (attrition  or  retention)  and 
vocational  performance  (mediocrity  or  excellence) 
in  a  military  college. 

(2)  To  test  the  validity  of  these  empirical  scales  through 
a  cross-validation  study  on  a  new  sample. 


Samples

Two samples of cadets were used, one for validation procedures
and one for cross-validation. The first sample consisted of 400 cadets
who entered Royal Roads Military College (RRMC), Victoria, B.C.,
Canada in 1968, 1969 and 1970 with the expectation of completing two
years at RRMC plus two years at Royal Military College (RMC), Kingston,
Ontario, Canada in order to graduate from RMC in 1972, 1973, and 1974
respectively. The second sample was 266 cadets who entered RRMC in
1971 and 1972, expecting to graduate from RMC in 1975 and 1976
respectively. All cadets were administered the SVIB on entry.

The  Instrument 

The Strong Vocational Interest Blank is an empirically derived
inventory developed by E.K. Strong shortly after World War I. In its
present form (Campbell, 1969), the SVIB for Men consists of 399 items
drawn from several areas of life: occupations, school subjects,
amusements, kinds of people, work situations, etc. There are 54
occupational scales, 22 Basic Interest scales and 8 non-occupational
scales in the SVIB.

Validation  Procedure 

In order to develop empirical scales that would discriminate
between groups, it was first necessary to define reference and criterion
groups. For the final status or disposition categories, the task was
not difficult. The 400 cadets in the validation sample were classified
as graduated on schedule (N = 167), voluntarily disenrolled (N = 104),
academically failed (N = 37), and militarily failed (N = 38). Fifty-four
(13.5%) of the sample did not fit into these categories and were removed
from the analysis. In the construction of the scales, the graduates
constituted the reference group and voluntary disenrollees, academic
failures, or military failures constituted the criterion groups, each in
separate analyses.

For the scales constructed to discriminate excellence or
mediocrity in vocational performance, only those cadets reaching fourth
year were used in the analyses. Performance was assessed by academic
grades, military grades and cadet officer appointment level. The
academic mediocrity/excellence categories consisted of a reference group
with third class, pass or borderline academic grades (N = 82) and a
criterion group with second and first class grades (N = 88). The military
grades, determined by commissioned officers whose primary responsibility
was the care, management and assessment of squadrons of cadets, permitted
the classification of cadets into mediocrity and excellence categories.
These consisted of a reference group with military grades of C-, C or
C+ (N = 118) and a criterion group with grades of A or B (N = 45).

The cadet wing is partially self-regulated by a cadre of fourth
year cadets assigned to "cadet officer appointment" positions. The
cadet holding the highest position is called the Cadet Wing Commander
(air force rank terminology) and wears five stripes or bars on his tunic,
hence holds a 5-bar position. Bar positions exist at the 5, 4, 3 and
2-bar levels. Fourth year cadets not selected for cadet officer appointments
receive no bars. Therefore, for the cadet officer appointment level or
"bars" categories, the reference group consisted of fourth year cadets
assigned 0 or 2 bars (N = 98) and the criterion group consisted of fourth
year cadets assigned 3 or 4 bars (N = 77). No cadets in the sample
received 5 bars.

Considerable controversy exists in the psychometric literature
as to the appropriate scale length for scales intended for discriminating
between groups. Too few items (e.g., fewer than 40) reduce reliability
while too many items emphasize chance differences or create unnecessary
redundancy (Campbell, 1971). Campbell suggested a scale length of not
more than 100 items and preferably not less than 60, but if necessary,
a minimum of 40. In this project, to ensure that sufficient items
were used while at the same time selecting only items that discriminated
between groups with percentage differences of acceptable magnitude
(> 10%; Abrahams and Neumann, 1973), it was decided that two scales for
each of the six criteria would be constructed: a 50-item and a 75-item
scale. Therefore, 12 empirical scales were constructed.

The precise method for selecting and then weighting discriminating
items has varied with new developments and new instruments in psychometrics.
In an exceptionally thorough investigation, Sands (1975) developed and
evaluated many item response weighting procedures. For the problems
addressed in his study, almost all of the different weighting methods had
essentially the same ability to differentiate between groups. Sands
suggested, therefore, that the simplest procedures for weighting continue
to be used.

In this project, items showing differences in percent responses
of > 10% between the reference and criterion groups were identified and
given unit weights in the appropriate direction. The items were then
dimensionalized by weighting the opposite end of the items in the reverse
direction. The Indifferent response was weighted in the appropriate
direction (+1 or -1) if the percentage difference between the criterion
and reference group was about 10% or more. Otherwise, it was given a
weight of 0. Constants of 50 and 75 were added to the 50-item and
75-item scales respectively to convert all scores to positive values in
scale ranges from 0-100 and 0-150.
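
The weighting rule just described can be sketched in code. The Python
fragment below is an illustrative reconstruction only, not the program
used in the study. The function names, the "L"/"I"/"D" response coding,
the use of >= for the 10% threshold, and the choice to key each item on
whichever of the Like or Dislike responses shows the larger percentage
gap are all assumptions made for the sketch.

```python
RESPONSES = ("L", "I", "D")  # Like, Indifferent, Dislike (hypothetical coding)


def pct(group, item, response):
    """Percentage of a group (list of dicts) giving `response` to `item`."""
    hits = sum(1 for person in group if person[item] == response)
    return 100.0 * hits / len(group)


def item_weights(criterion, reference, item, threshold=10.0):
    """Return {response: weight} for one item, or None if it fails the threshold."""
    diff = {r: pct(criterion, item, r) - pct(reference, item, r) for r in RESPONSES}
    # Assumption: key the item on the end (Like or Dislike) with the larger gap.
    end = max(("L", "D"), key=lambda r: abs(diff[r]))
    if abs(diff[end]) < threshold:
        return None
    sign = 1 if diff[end] > 0 else -1
    other = "D" if end == "L" else "L"
    # Unit weight on the keyed end, reverse weight on the opposite end.
    weights = {end: sign, other: -sign, "I": 0}
    # Indifferent is weighted only if it also differs by the threshold or more.
    if abs(diff["I"]) >= threshold:
        weights["I"] = 1 if diff["I"] > 0 else -1
    return weights


def scale_score(person, scale, constant):
    """Sum the unit weights over the scale's items and add the constant."""
    return constant + sum(w[person[item]] for item, w in scale.items())
```

With the constant set equal to the number of items, a scale of n items
yields scores between 0 and 2n, which matches the 0-100 and 0-150 ranges
given for the 50-item and 75-item scales.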

Validation of the scales was investigated by applying the scales
to the scale development sample. A comparison of reference group and
criterion group scores for each scale yielded biserial correlations
which indicated the scale's validity for predicting the particular
criterion being assessed. As well, expectancy tables were constructed
to demonstrate graphically the magnitude of relationships between the
scale scores and criterion variables.

Cross-validation  Procedure 

This second phase constituted a cross-validation study on a new
sample to test the validity of any relationships found above (i.e.,
validity generalization). The comparison of predicted and actual outcomes
in this new sample determined whether the empirical scale development
method could be used to reduce attrition losses in the military college
system.


Results


Validation  Phase 

Empirical Scales. Table 1 provides descriptive statistics for
the 12 empirical scales constructed. Same-named scales (in the order
listed in Table 1) of 50-item and 75-item lengths correlated .950,
.950, .916, .951, .969 and .952, all p < .001, indicating that
considerable redundancy and little benefit resulted from the difference
in scale lengths.

Discriminating Capacity. The ability of the scales to
discriminate between the criterion groups and the reference groups was
assessed by calculating biserial correlations as estimates of the
relationship between the scale scores, distributed continuously, and
the criterion and reference groups, which together constituted a
dichotomous variable. Biserial correlations were used instead of point
biserial correlations (Abrahams and Neumann, 1973) since the variables
underlying the dichotomies were assumed to be continuous and normal
(e.g., a continuum of predisposition to voluntarily disenrol).
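
The distinction between the two coefficients can be made concrete with a
short sketch. The Python fragment below uses the standard textbook
formulas; it is not taken from the study, and the function names and the
sample data in the usage note are hypothetical. The biserial coefficient
rescales the point biserial by sqrt(pq)/y, where y is the ordinate of the
standard normal curve at the point dividing it into proportions p and q.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev


def point_biserial(scores, in_criterion):
    """Point-biserial r between continuous scores and a 0/1 group flag."""
    crit = [s for s, g in zip(scores, in_criterion) if g]
    ref = [s for s, g in zip(scores, in_criterion) if not g]
    p = len(crit) / len(scores)
    q = 1.0 - p
    # pstdev is the population standard deviation of all scores combined.
    return (fmean(crit) - fmean(ref)) / pstdev(scores) * sqrt(p * q)


def biserial(scores, in_criterion):
    """Biserial r, assuming a normal variable underlies the dichotomy."""
    p = sum(1 for g in in_criterion if g) / len(in_criterion)
    q = 1.0 - p
    std_normal = NormalDist()
    y = std_normal.pdf(std_normal.inv_cdf(p))  # normal ordinate at the cut
    return point_biserial(scores, in_criterion) * sqrt(p * q) / y
```

For example, `biserial([40, 45, 50, 55, 60, 65, 70, 75], [0, 0, 0, 1, 0, 1, 1, 1])`
is larger in magnitude than the corresponding point biserial, as is
always the case, because sqrt(pq)/y exceeds 1 for every split.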

For all 12 scales, the biserial correlations were significantly
different from 0. Values ranged from .722 to .935, which indicated that
the scale scores were distributed significantly differently between
criterion and reference groups. These results were not surprising,
however, since the scales were being applied to the same sample used to
develop them. This procedure artificially inflated the statistics,
reflecting a relationship that could be expected to shrink in a
cross-validation sample.

The expectancy table in Figure 1 was prepared as an example to
demonstrate graphically the magnitude of the relationships between the
scales and the criteria. The expectancy table was constructed by dividing
the scale score distributions into quintiles and assessing these quintiles
as to their proportion of cases from the reference group and the criterion
group. Figure 1 depicts the increasing proportion of voluntary
disenrollees found in each quintile of the scale score distribution.
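
The construction of an expectancy table can be sketched as follows. This
Python fragment is a minimal illustration under stated assumptions, not
the procedure actually programmed for the study: cases are ranked by
scale score and cut into five equal-sized groups, ties are left in input
order, and the sample is assumed to contain at least as many cases as
bins. The function name is hypothetical.

```python
def expectancy_table(scores, in_criterion, bins=5):
    """Proportion of criterion-group cases in each score quantile, low to high."""
    ranked = sorted(zip(scores, in_criterion), key=lambda pair: pair[0])
    n = len(ranked)
    table = []
    for b in range(bins):
        # Integer arithmetic gives bins of (nearly) equal size.
        chunk = ranked[b * n // bins:(b + 1) * n // bins]
        table.append(sum(1 for _, flag in chunk if flag) / len(chunk))
    return table
```

A scale that discriminates well produces proportions that rise steadily
from the lowest to the highest quintile.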

Continuing with the example in Figure 1, for the 75-item voluntary
disenrolment scale, voluntary disenrollees in the lowest through highest
quintiles constituted approximately 3%, 10%, 24%, 31% and 65% of the
quintiles' populations. Only one of the twelve expectancy tables is
presented here since the trends were very similar across all scales.


FIGURE 1

Tables Depicting Likelihood of Voluntary Disenrolment
Associated with Scores on the Voluntary Disenrolment Empirical
Scales: Validation


Cross-validation Phase

The ability of the empirical scales to discriminate between
criterion and reference groups in the cross-validation sample was
indicated by the magnitude of the biserial correlations. For 6 of
the 12 scales, the biserial correlations were significantly different
from 0 (Table 2). The two voluntary disenrolment scales possessed the
highest discriminating abilities, the two bars scales the next highest.
The two military performance scales possessed discriminating abilities
that just reached significance. The two academic failure scales, the
two military failure scales and the two academic performance scales
failed to discriminate significantly between criterion and reference
groups. Table 2 contains criterion and reference group means, biserial
correlations and significance levels for all twelve scales. The
"shrinkage" anticipated from applying the scales developed in the
validation sample to the cross-validation sample did occur.

Due to several considerations not presented thoroughly in this
paper (e.g., lack of discriminating ability as indicated in expectancy
tables for military performance, the redundancy in scales in that
military performance scales and bars scales were essentially measuring
military success, the fact that the bars scales were discriminating
better, etc.), the decision was made to drop the military performance
scales from further consideration. This, in essence, meant that, for
predicting attrition, the two voluntary disenrolment scales would be
used and, for predicting military excellence among graduates, the two
bars scales would be used. The expectancy tables for these scales are
presented in Figures 2 and 3.

For these four remaining scales, item-scale score correlations
were calculated to identify those items not contributing significantly
to the scale totals. Items not significantly correlated were removed
from the scales, producing new scales of shorter length (renamed VODI45,
VODI55, BARS40 and BARS57 to reflect scale length). Biserial
correlations calculated for these abbreviated, "purified" scales
indicated that the discriminating abilities of these scales were not
significantly improved.

In the expectancy tables for the revised voluntary disenrolment
scales (Figure 4), all distributions in all quintiles were within a few
percentage points of the same distributions before the scales were
revised (Figure 2). For the revised bars scales, the distributions in
the quintiles appeared slightly improved in that the extreme cases (0 bars
and 4 bars) were better distributed across the quintiles. (Compare
Figures 3 and 5.)

Bars 50   50.5   46.3    8.2   .323
Bars 75   78.1   72.7   10.8   .310


FIGURE 2

Expectancy Tables Depicting Likelihood of Voluntary Disenrolment
Associated with Scores on the Voluntary Disenrolment Empirical
Scales: Cross-validation


FIGURE 3

... with Scores on the Bars Empirical Scales: Cross-validation


Tests of internal consistency were conducted using split-half
reliabilities to determine if revising the scales through removal of
items not correlated with scale scores had improved the scales. For
the VODI45 and VODI55 scales, correlations between test halves equalled
.478 and .582 which, when corrected with the Spearman-Brown formula,
equalled .647 and .735, p < .001, compared to .592 and .567 before
revision. It appeared that scale reliability was improved by the scale
revisions, as indicated by the reliability coefficients; however, the
predictive validity of the scales did not significantly change.
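
The Spearman-Brown correction applied to the split-half correlations is
the standard step-up formula, r_full = 2r / (1 + r). A brief Python
sketch follows; the function name and the optional lengthening factor
are illustrative choices, not part of the original analysis.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test `factor` times as long as the part
    whose reliability (here, the half-test correlation) is r_half."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)
```

Applied to the half-test correlations of .478 and .582, the formula gives
values of roughly .647 and .736, in line with the corrected coefficients
reported above; the small difference in the third decimal presumably
reflects rounding of the half-test correlation.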

The potential advantage in using the scales for selection
could be exemplified using the voluntary disenrolment likelihoods
outlined in Figure 4. Those candidates scoring in the top quintile of
the VODI55 scale were 4 1/2 times more likely to disenrol than
candidates in the bottom quintile. For Figure 5, candidates scoring in
the top quintile in the BARS40 scale were 3 times more likely to
achieve 3-bar or 4-bar cadet appointments than those scoring in the
bottom quintile.

To identify the psychological traits underlying the differences
in scale scores between criterion and reference groups, factor analyses
of the scale items were conducted. A principal factors analysis with
orthogonal Varimax rotation (Nie et al., 1975) was carried out with the
items in the VODI and BARS scales. The factor accounting for the largest
percentage (29.6%) of variance in the voluntary disenrolment scale
contained 8 items concerned with artistic/aesthetic interests (e.g., art
museum director) which were endorsed as "liked" by voluntary
disenrollees. The second factor, containing 8 items representing
engineering/mechanical interests (e.g., electronics equipment designer)
and the third factor, which included military activities (e.g., military
drill), both consisted of items that were responded to in the negative by
voluntary disenrollees. The fourth factor contained items reflecting
acceptance of nonconformity in others (e.g., like, as people,
daydreamers, beachcombers) to which voluntary disenrollees responded
positively. The fifth factor included items representing military types
(e.g., General of the Army) to which voluntary disenrollees responded
negatively. Total variance accounted for by the first five factors was
85.4%. In all, the candidates opting for early disenrolment appeared to
have interests that were more artistic or aesthetic, less mechanical,
less oriented toward military activities and military people, and more
liberal toward nonconformity.

The five clusters of items differentiating militarily
excellent cadets from mediocre cadets, according to the bars scales,
were all endorsed more positively by excellent cadets than mediocre
cadets. The first cluster included social-artistic interests (e.g.,
dramatics, psychologist) and the second cluster represented athletic
and recreational leadership (e.g., athletic director). The third
cluster involved social responsibilities (e.g., starting an activity)
while the fourth cluster consisted of items on teaching and directing
others (e.g., teach children, scout leader). The final cluster involved
items reflecting enterprising endeavours (e.g., stockbroker, corporation
lawyer). Collectively, these factors accounted for 82.9% of the
variance, and indicated that cadets who excelled in the military college
system generally had more interests in aesthetic, athletic, social
and enterprising activities, and liked to teach and direct others. The
converse appeared to be that mediocre cadets had interests in mechanical
but not aesthetic areas, were less interested in athletics, were more
reserved, were not enterprising and preferred not to teach or direct
others. Hindsight allowed the observation that these factors basically
identified the characteristics desired in a strong leader: an artistic,
creative, flexible orientation; physically active and involved;
extroverted and capable of infusing life into an organization; interested
in teaching or instructing others; and possessing an enterprising manner
and ability to influence others.


Conclusions 


The empirical scales, developed to predict attrition or
retention, and excellence or mediocrity for retained cadets, showed a
strong capacity to predict. The scales designed to predict probabilities
for voluntary disenrolment did successfully differentiate between cadets
who did voluntarily resign and cadets who stayed and graduated. As well,
separate scales differentiated between mediocrity and excellence in
cadet performance as indicated by the level of cadet officer appointment
(bars) assigned in fourth year. This research supports the conclusion
that the SVIB is a valid predictor of military college attrition or of
retention with excellent performance. On the basis of these results,
it is recommended that the SVIB be further investigated for its full
potential to predict early disenrolment and military excellence.


An extensive literature exists concerning construction, validation,
and use of mental tests. In view of this fact, there is a surprising
lack of research concerning the actual building blocks of objective
mental tests: test items. Wesman (1971) has pointed out that texts on
measurement are filled with rules of thumb for construction of test
items, yet little empirical research exists which validates these rules.
Only sparse information is available concerning effects of violating
these rules for item-writing or the strictness with which these rules
should be followed.

One aspect of test item construction which has received some
attention in the literature is item cluing. A multiple-choice item
is clued when structural or grammatical faults aid unknowledgeable
examinees to find the correct answer. Several studies have examined
grammatical mismatches between incorrect answers and the stem and items
in which the correct answer is much longer or shorter than the
distractors (Board & Whitney, 1972, and McMorris, Brown, Snyder, & Pruzek,
1972, for example). The results of this research have been inconsistent.

Rules concerning test item formats, such as those suggested by
Adkins (1974), have also received some study. Several investigators
have studied items which contain responses such as "none of above" or
"all of above" (Rimland, 1960; Hughes & Trimble, 1965; Williamson &
Hopkins, 1967). Items containing alternatives such as these tend to
be more difficult, but do not differ from control items in other respects.
Furthermore, these results have been inconsistent. Board and Whitney
(1972) and Vitola and Cantrell (1961) have studied open-stem items,
that is, items in which responses complete the stem, as contrasted with
items in which the stem is a grammatically complete question. In both of
these studies, open-stem items were found to be more difficult than
closed-stem items. Cieutat (1960) found that rote-memory items had both
higher reliabilities and validities than application items.

As can be seen, the research results concerning format rules for
multiple-choice test items have been both sparse and inconsistent. The
purpose of the present study is to provide further information concerning
the usefulness of several rules that have been proposed for test item
formats. Four item-writing rules are examined here. The item formats
tested are (a) multifactor items, in which the correct answer
contains several parts, (b) situational items, which require
application of knowledge to real-life situations, (c) negative items, in
which the examinee must select the one incorrect answer, rather than
the one correct answer, and (d) open-stem items. Dependent variables
examined are item difficulty and reliability. For each item which was
constructed to have one of the experimental formats in this study, a
control item was written covering parallel content but which was
single-factor, nonsituational, positive, and closed-stem. Therefore, item
content was controlled in tests of the experimental formats. This
design allowed a purer test of the experimental formats.


Method 

Twenty-five four-alternative multiple-choice items were constructed
using each of the four experimental formats. The content of these items
was general information in areas such as science, math, English, art,
and history. For each experimental item, a control item was written
covering, as much as possible, the same content. The control items
followed none of the formats to be tested. Instead, they were positive,
single-factor, closed-stem, and nonsituational. Tables 1 through 4
contain examples of experimental and matching control items for each
format tested. Two 100-item test forms were compiled from these
experimental and control items. Each item appeared on only one form. The
experimental items for each format were divided evenly between the two
forms. An experimental item and its matching control item always were
placed on opposite forms, but in the same sequence location.

Each  test  form  was  administered  to  approximately  100  airmen  who 
held  grades  between  E-4  and  E-6.  The  tests  were  administered  under 
anonymous  conditions.  Therefore,  no  data  is  available  concerning  the 
age,  sex,  or  ethnic  group  characteristics  of  the  samples. 


Results 

For each item, the difficulty was computed by determining the
proportion of airmen who selected the correct answer. Three reliability
estimates were computed for each item. First, the point-biserial
correlation was computed between the score on each item and the total
score for the test form on which the item appeared. Then each test
form was factor-analyzed. On each test form, an inspection of the
eigenvalues revealed discontinuities at 6 and 11 factors. Therefore,
varimax rotation was applied to both 6- and 11-factor solutions for
each test form. Squared multiple correlations were used as communality
estimates. For each item, the factor with which the item had the largest
correlation was identified in each rotated solution. These
largest correlations were used as additional estimates of the items'
reliabilities.
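
The first two item statistics can be illustrated with a short Python
sketch. It assumes items are scored 1 (correct) or 0 (incorrect) and
that the total score includes the item itself, as the description above
implies; the function names are hypothetical and the code is not the
original analysis program.

```python
from math import sqrt
from statistics import fmean, pstdev


def item_difficulty(item_scores):
    """Proportion of examinees answering the item correctly (scores are 0/1)."""
    return fmean(item_scores)


def item_total_point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and the total test score."""
    p = fmean(item_scores)
    q = 1.0 - p
    right = [t for i, t in zip(item_scores, total_scores) if i == 1]
    wrong = [t for i, t in zip(item_scores, total_scores) if i == 0]
    # Undefined (division by zero) if everyone or no one answers correctly.
    return (fmean(right) - fmean(wrong)) / pstdev(total_scores) * sqrt(p * q)
```

Note that a higher "difficulty" value in this sense means an easier
item, since it is the proportion answering correctly.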

Therefore, one difficulty estimate and three reliability estimates
were available for each item. For each experimental item format, an
analysis of variance (ANOVA) was done with each of these four dependent
variables. The reliability estimates were squared before being analyzed.
The unit of analysis in these ANOVAs was the individual test item. The
independent variables were experimental vs control and test form. Because
experimental and control items were matched on content but always
appeared on different forms, a Latin square design was used in the
ANOVAs. The layout of this design is illustrated in Figure 1. This
design permits both a between-groups test and within-groups tests of
main effects for the two independent variables. Interaction effects
could not be tested with this design. Table 5 contains percents of
variance attributable to each effect for the various ANOVAs and
degrees of freedom. Out of 48 F-ratios computed in the ANOVAs, only
one was significant at the .05 level. This is probably a chance result.
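
The reasoning behind the "chance result" interpretation can be shown
with a small calculation. The Python sketch below assumes, for
simplicity, that the 48 tests are independent, which is only an
approximation here because the dependent variables are related; the
function name is hypothetical.

```python
def chance_hits(n_tests, alpha):
    """Expected number of chance-significant tests, and the probability of
    at least one, when every null hypothesis is true and tests are independent."""
    expected = n_tests * alpha
    p_at_least_one = 1.0 - (1.0 - alpha) ** n_tests
    return expected, p_at_least_one


expected, p_any = chance_hits(48, 0.05)
```

Under these assumptions about 2.4 of the 48 F-ratios would be expected to
reach the .05 level by chance alone, and the probability of at least one
doing so is roughly .91, so a single significant result is unremarkable.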


Discussion  and  Conclusions 

The present results indicate that none of the four test item
formats examined influences either difficulty or reliability, after
item content is controlled. Findings of Vitola and Cantrell (1961)
concerning open-stem items and of Cieutat (1960) concerning
application items were not replicated.

The present results show that the four item formats tested
should be neither preferred nor avoided in order to control difficulty
or reliability. However, the present results do not preclude the
possibility of format-by-content interaction effects on these dependent
variables. Some formats may be more appropriate for some content.
Furthermore, other considerations may dictate preference for or
avoidance of these formats in test construction. For example,
situational items may have greater content validity than rote memory items
and therefore might be preferred when content validity is important.


Table 1


Sample Test Items - Multifactor


Experimental

Ordinary table salt is made up of
what elements?

a. Chlorine and magnesium
b. Potassium and magnesium
c. Potassium and sodium
d. Chlorine and sodium


If 2x/y = 10, which of the following
values for x and y are correct?

a. x = 10, y = 3
b. x = 14, y = 7
c. x = 15, y = 5
d. x = 20, y = 4


What state was the most recent
to enter the union, and in what
year did it enter?

a. Alaska in 1959
b. Hawaii in 1959
c. Alaska in 1960
d. Hawaii in 1960


Control

Ordinary table salt contains
which one of the following elements?

a. Chlorine
b. Fluorine
c. Magnesium
d. Potassium


If 2x/4 = 10, what value for x
is correct?

a. 10
b. 14
c. 15
d. 20


In what year was the most
recent state admitted to the
union?

a. 1958
b. 1959
c. 1960
d. 1961

Table  2 


Sample Test Items - Negative


Experimental

Which one of the following animals
is NOT a vertebrate?

a. Humming bird
b. Rattlesnake
c. Tuna fish
d. Earthworm


Which one of the following plays
was NOT written by William
Shakespeare?

a. Pygmalion
b. The Tempest
c. Julius Caesar
d. The Merchant of Venice


Which one of the following works
of art was NOT produced by
Michelangelo?

a. David
b. La Pieta
c. The Last Supper
d. The Sistine Ceiling


Control

Which one of the following animals
is a vertebrate?

a. Octopus
b. Starfish
c. Earthworm
d. Rattlesnake


Which one of the following plays
was written by William Shakespeare?

a. Faustus
b. Pygmalion
c. The Tempest
d. The Crucible


Which one of the following works
of art was produced by Michelangelo?

a. The Mona Lisa
b. Venus De Milo
c. The Last Supper
d. The Sistine Ceiling


Table 3


Sample Test Items - Open Stem


Experimental

The largest planet in our solar
system is

a. Jupiter
b. Neptune
c. Uranus
d. Saturn


Suffrage is the right to

a. petition
b. assemble
c. worship
d. vote


The predominant religion in
Chile is

a. Episcopalian
b. Methodist
c. Lutheran
d. Catholic


Control

What planet is the largest in
our solar system?

a. Jupiter
b. Neptune
c. Uranus
d. Saturn


Suffrage refers to what right?

a. To petition
b. To assemble
c. To worship
d. To vote


What religion is predominant
in Chile?

a. Episcopalian
b. Methodist
c. Lutheran
d. Catholic


Table 4


Sample Test Items - Situational


Experimental

If a 15-ounce bottle of dishwashing
liquid cost $.90, what is
the cost per ounce of the liquid?

a. 4 cents
b. 6 cents
c. 10 cents
d. 15 cents


A student wishes to find information
in a library on the current
state of the economy in the U.S.
Which one of the following sources
would be most useful in finding
this information?

a. An encyclopedia
b. The Readers' Guide
c. The card catalogue
d. An economics textbook


If a person planned a trip to
Brazil, what language course would
probably be most useful to him or
her?

a. English
b. Spanish
c. French
d. Portuguese


Control

What is 1/15 of .9?

a. .04
b. .06
c. .10
d. .15


What material is contained in
the Readers' Guide?

a. Condensed novels
b. Famous quotations
c. Definitions of words
d. Titles of magazine articles


What is the official language
of Brazil?

a. Spanish
b. English
c. French
d. Portuguese


Table 5


ANCVA  Results  -  Proportions  of 
Variance  Attributable  to  Various  Sources 


Situational 

dlff 

Rl 

R6 

RU 

df 

Between  item  contents 

.8403 

.8379 

.5856 

.4806 

25 

item  groups 

.0617 

.0927 

.0025 

.0048 

1 

items  within  groups 

.7786 

.7452 

.5831 

.4758 

24 

Within  item  contents 

.0097 

.0003 

.0169 

.0566 

50 

test  form 

.0053 

.0004 

.0169 

.0389 

1 

format  (exp  vs  con) 

.0344 

.0000 

.0001 

.0177 

a 

i 

Residual 

.1500 

.1618 

.3975 

.4628 

48 

Open  Stem 

diff 

rl 

R6 

RU 

df 

Between  item  contents 

.9551 

.7285 

.6172 

.6752 

25 

item  groups 

.0939 

.0699 

.0434 

.0619 

1 

items  within  groups 

.8612 

.6586 

.5738 

.6133 

24 

Within  item  oontents 

.0003 

.0069 

.0642 

.0190 

50 

test  form 

.0001 

.0049 

.0642 

.0187 

1 

format  {exp  vs  con) 

.0002 

.0020 

.0000 

.0003 

1 

Residual 


0446  .2646  .3186  .3058 


48 


Table 5 (continued)

ANOVA Results - Proportions of
Variance Attributable to Various Sources

Negative                  diff     R1      R6      R11     df

Between item contents     .7067   .7191   .5578   .6440    25
  item groups             .0297   .0008   .0244   .0470     1
  items within groups     .6730   .7183   .5334   .5970    24
Within item contents      .0685   .0133   .0362   .0347    50
  test form               .0571   .0039   .0250   .0264     1
  format (exp vs con)     .0133   .0089   .0126   .0096     1
Residual                  .2248   .2676   .4060   .3213    48

Multifactor               diff     R1      R6      R11     df

Between item contents     .8591   .8671   .6100   .4963    25
  item groups             .0007   .0098   .0019   .0257     1
  items within groups     .8584   .8573   .6081   .4706    24
Within item contents      .0110   .0093   .0005   .0169    50
  test form               .0000   .0000   .0002   .0163     1
  format (exp vs con)     .0110*  .0093   .0003   .0007     1
Residual                  .1299   .1236   .3900   .4868    48

*p < .05

Figure 1

Analysis of Variance Design

I.  INTRODUCTION


The purpose of this report is to define the components and analysis of
a mental measurement model which may be used to analyse responses classified
in one of several categories of the items of the measurement instrument. Such
measurement instruments may be the usual tests or examinations (with extended
response alternatives) or questionnaires. This situation also covers performance
measurement when it is desired to have extended categories of correct or
incorrect response.

To fulfill the purpose we first present some background material to the
approach we use to specify the model and its parameters. The method used to
estimate the parameters is developed as a generalisation of a psychometric
estimation procedure, called the Frequency Ratio Method, devised by the author
for analysing the Binary Response Measurement Model. Applications are suggested
and an example is provided.

II.  BACKGROUND 

2.1  Influences of G. Rasch, B. D. Wright and P. Blommers

After the publication of a new book by Rasch (1960) on measurement
models, this work received little attention at measurement centers and
universities of the U.S. except at the State University of Iowa and the
University of Chicago.


Since Rasch's work was unique, interesting, provocative, and
useful, considerable speculation as to why the book did not generate more
attention is justified. The most spectacular aspect of Rasch's models was their
"specifically objective" features meaning, for example, that a person's ability,
as estimated from his responses, is independent of the items used to measure the
ability, and the easiness or difficulty of the items is independent of the
populations from which the persons are sampled. Hence, Wright succinctly refers to
Rasch Models with "Person free item measurement and item free person measurement."

The truth of this property has often been challenged, but most
challengers forget that these properties are easily proved mathematically and
are empirically true,

a.  only if the data fit the measurement model being tested, and

b.  only within measurement error of the statistics involved.

My thought on this issue is that many of the readers of Rasch's
work suffered from "Objectivity Shock" meaning that they did not consider
Specific Objectivity possible and put the book aside. Neither Professor Ben
Wright of the University of Chicago nor Professor Paul Blommers of the
University of Iowa was greatly bothered by this problem; both did their own
research in this type of measurement and encouraged Ph.D. candidates to do likewise.

This leadership greatly popularised Rasch's work in this country and served to
develop it further. Wright also taught courses on Rasch Models at professional
meetings. Most of Wright's work as well as that of his students dealt with
Rasch's achievement test model whereas Blommers' students worked with that
model as well as Rasch's models for reading speed and oral reading accuracy.
Wright and Panchapakesan (1969) developed an unconditional maximum likelihood
estimation procedure and computerised the analysis for the Rasch achievement
model.


2.2  The Binary and Polychotomous Measurement Models

In most achievement and ability tests and examinations, the
responses to the test items are classified into one of two categories, either
"correct" or "incorrect." This circumstance leads to a metrological model for
analysis which may be termed a "Binary Measurement Model" or BMM. The BMM is
also called a dichotomous model. It is this model that Wright (1968), Rasch
(1960) and Moonan (1974) considered and provided different, but equivalent,
analyses for.


However, there are many other response situations that are useful
and must be accommodated and analysed. One of these occurs in achievement and
ability testing and gives rise to responses, classified by testees or
psychometricians, into one of several (R > 2) categories. This situation naturally
occurs in performance measurement and in attitude, interest, and motivation
surveys and in questionnaires. In such cases the Binary Response Measurement
Models (BMM) are not appropriate and a generalisation is required. It is our
purpose to define an analysis for such a model, called the Polychotomous
Response Measurement Model (PMM), in this report. Such models have been
considered by others: Rasch (1961), Vogt (1971) and Anderson (1973). Anderson
has provided a successful computerised analysis of the PMM. This used a new
and very complicated mathematical estimation procedure called "Conditional
Maximum Likelihood." The approach used here is greatly simplified and is an
extension of that used by Moonan (1974) to analyse the BMM. Vogt made her analysis
with unconditional maximum likelihood techniques, but indicated her approach
was not entirely successful (personal communication). Rasch did not carry
the analysis of the PMM very far.

2.3  The Process of Mental Metrology

We  shall  attempt  now  to  outline  the  considerations  believed 
important  for  developing  and  analysing  a  mental  metrological  instrument. 

2.3.1  The Instrument Goal and Objectives

The first thing for the metrologist to do is to decide what
mental quality or characteristic of persons he desires to measure. This is
called his "metrological goal" and indicates the general property that the
instrument is designed to measure. Other related sub-characteristics intended
to be also measured by the same device are called "objectives."

2.3.2  The  Items  and  Their  Functions 


The measurement instrument may be thought of as a kind of
large "hurdle," the successful surmounting of which by a person measures his
skill at the task attempted. Rather than having one single hurdle it is
customary for the instrument to have several, I, related small hurdles, called
items, each of which is designed to

(a)  elicit a piece of relevant information, knowledge or
skill, or

(b)  provide an indication of an attitude, interest, opinion
or judgment regarding a problem, situation or circumstance related to the goal
and objectives.

Each item is assumed to have a property, symbolised by c(i), and called its
easiness (or difficulty, d(i)), which characterises the facility with which
the most favorable responses to these items are evoked from persons in general.
Occasionally an item parameter, called the index of discrimination, is used in
the model. Both Schmidt (1969) and Wright (1968) have indicated that the use
of this index in the model is fatal to the achievement of objectivity, so we
omit its consideration in our system.

2.3.3  Responses - Free and Controlled

Each item has associated with it a single reply or a set of R ≥ 2
replies, called responses, appropriate to the problem or question posed by the
item. It is most efficient if these responses are indicated by the same
categories for each item. Achievement or ability measuring instruments usually
have Binary Categories such as

(a)  Correct and Incorrect ;  R = 2

(b)  True and False ;  R = 2

(c)  Yes and No ;  R = 2

(d)  Right and Wrong ;  R = 2

Instruments measuring affect usually have either Polychotomous or Binary
categories such as

(a)  like, indifferent, dislike ;  R = 3

(b)  strongly agree, agree, undecided, disagree,
     strongly disagree ;  R = 5

(c)  approve, disapprove ;  R = 2

(d)  True, False ;  R = 2

(e)  yes, maybe, no ;  R = 3

(f)  synonym, antonym, neither ;  R = 3

(g)  satisfactory, unsatisfactory ;  R = 2.


Responses to some item types are called "Free" if the reply is uncontrolled
or not constrained in any way but may assume any form, character or specificity
including written or verbal phrases, or statements, or numerical calculations.
On the contrary, controlled responses imply a set of categories, the same or
not for each item, into one of which the natural reply of the person to the
item may be classified or designated by the person or the psychometrician.

The response categories are often ordered, or assumed
orderable, on a scale of correctness, appropriateness, suitability, desirability
or some other quality appropriate to the goals and objectives. This scale is
calibrated and a point for each response category is indicated by θ(r), r = 1,R.

We assume that the least desirable category is r = R for which θ(R) = 0. This
assumption is made for both Binary and Polychotomous Response models. Because
of the importance of this assumption in our subsequent derivations we designate
it separately from the text as

(1)  $\theta(1) = 1, \qquad \theta(R) = 0.$

In the dichotomous model the θ(r) are considered to be the observables X(n,i) =
0 or 1. Although this could also be done for the polychotomous model where
X(n,i,r) represents the response category selected, we do not choose to do so
because we wish to identify the response category parameters to be quantified
via analysis.

Early analysis in psychometrics routinely made the assumption
given by (1). Since metrology models were not employed their only reason for
assuming (1) was "convenience". There is, of course, a more profound and
necessary reason which we shall discover.

2.3.4  Person  Parameters 


Each person receives a "score" (to be defined later) on
the instrument from which a value, a(n), of the characteristic being measured,
may be estimated. The estimation of a(n) for person n is usually the single
purpose for the psychometrician to administer the instrument to the person.

The a(n) for some scores are non-estimable. Were a person to receive the
"best or worst" score that it is possible to achieve on the instrument it would
not be appropriate to estimate a(n) in either case because the instrument is
either too easy or too difficult for such persons and the appropriate a(n) is too
uncertain. Consequently we assume that

(2)  a(n) for "perfect" or "zero" scores is non-estimable.

2.3.5  Psychometric Models

We have noted the desirability of considering the parameters,

(a)  Item parameters, c(i); i = 1,I

(b)  Response parameters, θ(r); r = 1,R

(c)  Person parameters, a(n); n = 1,N

A measurement model establishes a relation or relations
among the responses, X(n,i,r), and the parameters of the model. If r = 1, 2 = R the
model is called a Binary Measurement Model, BMM. If R > 2 the model is called a
Polychotomous Measurement Model, PMM. Measurement models are of varying degrees
of complexity and form, see Lord and Novick (1968), depending upon the relations
assumed and the number and nature of the parameters involved.

2.3.6  Model Analysis

The measurement models form the basis for finding estimates
of parameters desired to be estimated and for making a statistical test of
agreement between the model and the responses. Model analysis means estimating
the parameters and testing the agreement. The techniques for making analysis of
measurement models include various statistical and mathematical procedures such
as

(a)  least squares

(b)  conditional and unconditional maximum likelihood

(c)  simple statistical analysis

(d)  analysis of variance

(e)  probability analysis, and

(f)  algebraic analysis.

III.  THE POLYCHOTOMOUS MEASUREMENT MODEL (PMM)

3.1  Odds  and  Ends 

The odds that person n will respond to item i with response r is
defined to be

(3)  $O(n,i,r) = [a(n)\,c(i)]^{\theta(r)}, \qquad i = 1,\ldots,I; \quad r = 1,\ldots,R.$

The sum of the odds for that person to make a response other than r on item i is

(4)  $O(n,i,\bar{r}) = \sum_{k \neq r}^{R} O(n,i,k).$

The idea of developing measurement models in probabilistic terms via expressions
for the odds of a specific response is due to Rasch (1960).

From elementary probability theory we know that if the odds for the
occurrence of an event E are a/b, then the probability of the event E is

(5)  $P(E) = \dfrac{a}{a+b}.$

Now let E be the event characterised in our notation by (n,i,r) and
consider the odds ratio

(6)  $\dfrac{a}{b} = \dfrac{O(n,i,r)}{O(n,i,\bar{r})}$

as the ratio of odds that person n will make response r to item i
to the odds that he will make some other response to that item. The probability
of this event is, by (5), equal to

(7)  $P(n,i,r) = \dfrac{O(n,i,r)}{O(n,i,\cdot)} = \dfrac{O(n,i,r)}{\sum_{k} O(n,i,k)}.$

This is a form often assumed by other workers for the PMM. We could
have started here to satisfy those purists who do not believe we are justified in
applying (5) to (6) in order to get (7). It is more convenient to work with
probabilities than odds because the calculus of odds is not as well developed as
the Theory of Probabilities.
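
The step from odds (3) to probabilities (7) can be sketched in a few lines of Python. This is only an illustration: the function name and the numerical values of a, c and θ are assumptions chosen for the example, not values from the report.

```python
from typing import List


def category_probabilities(a: float, c: float, theta: List[float]) -> List[float]:
    """Return P(n,i,r), r = 1..R, for one person and one item."""
    odds = [(a * c) ** th for th in theta]   # equation (3): O = (a*c)**theta(r)
    total = sum(odds)                        # O(n,i,.), the sum over all categories
    return [o / total for o in odds]         # equation (7)


# Hypothetical parameter values, used only to exercise the function.
theta = [1.0, 0.6, 0.3, 0.0]                 # theta(1) = 1 and theta(R) = 0, as in (1)
probs = category_probabilities(a=1.5, c=2.0, theta=theta)

# With R = 2 the same function gives the binary model: P = ac / (1 + ac).
binary = category_probabilities(a=1.5, c=2.0, theta=[1.0, 0.0])
```

With θ(1) = 1 and θ(R) = 0 the last category always has odds 1, so it acts as the reference category.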

Note that our model is unidimensional in the sense that we do not
have a separate dimension of ability and easiness for each response category as
was postulated by Anderson (1973), and Vogt (1971). Such a multivariate model
may be useful, upon occasion, but is difficult to interpret and use in practice
so we do not employ it. Our problem is difficult enough without such
complications and we are content to start with this assumption, rather than, perhaps,
make it in the end anyway. The multivariate parameters may be expressed as

(8)  $[a(n)]^{\theta(k)} = a(n,k)$ and $[c(i)]^{\theta(k)} = c(i,k)$

which, in our model and notation, are expressed as

(9)  $\ln a(n,k) = t(n,k) = \theta(k)\,t(n)$ and $\ln c(i,k) = t(i,k) = \theta(k)\,t(i).$

The problem with our simplification is that it may be too simple
a model to adequately represent some sets of response data. In that case the
researcher should consult Anderson (1973) or use another approach such as that
provided by Bock (1970).


3.2  Applying the Frequency Ratio Method to the PMM

In 1974 the author developed an analysis for a simpler model, the
BMM. This procedure is called the Frequency Ratio Method because certain
probabilities, analogous to (7), were estimated as objective probabilities
using observed frequencies. We employ the same approach here since we believe
it to be easier, and almost as accurate, as the estimation procedures mentioned
on page 5.


Consider now the four possible events which can occur if the same
person n responds to items i and j with either response s or response r. We
have, since the responses are assumed independent, see (7),

(10)  $P[(n,i,s),(n,j,s)] = \dfrac{O(n,i,s)\,O(n,j,s)}{O(n,i,\cdot)\,O(n,j,\cdot)},$

(11)  $P[(n,i,s),(n,j,r)] = \dfrac{O(n,i,s)\,O(n,j,r)}{O(n,i,\cdot)\,O(n,j,\cdot)},$

(12)  $P[(n,i,r),(n,j,s)] = \dfrac{O(n,i,r)\,O(n,j,s)}{O(n,i,\cdot)\,O(n,j,\cdot)},$ and

(13)  $P[(n,i,r),(n,j,r)] = \dfrac{O(n,i,r)\,O(n,j,r)}{O(n,i,\cdot)\,O(n,j,\cdot)}.$

Using (3) we can express (11)-(12) in parametric terms with (14),
as (15) and (16).

(14)  $D(i,j,\cdot) = O(n,i,\cdot)\,O(n,j,\cdot)$

(15)  $P[(n,i,s),(n,j,r)] = [a(n)\,c(i)]^{\theta(s)}\,[a(n)\,c(j)]^{\theta(r)} \,/\, D(i,j,\cdot)$

(16)  $P[(n,i,r),(n,j,s)] = [a(n)\,c(i)]^{\theta(r)}\,[a(n)\,c(j)]^{\theta(s)} \,/\, D(i,j,\cdot).$


It is interesting and comforting to note that for the BMM with
θ(s=1) = 1 and θ(r=2) = 0 expressions (15) and (16) agree identically with
equations (7) and (?5) in Moonan (1974).

The conditional probability that person n responds to item i with
response s given that he used both responses s and r for the two items i and
j is

(17)  $P[(n,i,s) \mid \{(n,i,s)\cap(n,j,r)\} \cup \{(n,i,r)\cap(n,j,s)\}] = \dfrac{(15)}{(15)+(16)},$

also

(18)  $P[(n,i,r) \mid \{(n,i,s)\cap(n,j,r)\} \cup \{(n,i,r)\cap(n,j,s)\}] = \dfrac{(16)}{(15)+(16)}.$


Simplifying (17) and (18) using (3) we note that a(n) cancels,
showing that the PMM has the same and usual property of objectivity that the
BMM and other Rasch models enjoy. This means that for response data sets
that fit (7) the estimated parameters c(i) and θ(r) are invariant estimates,
except for measurement error, independently of the populations from which the
persons are obtained, and from which the items were selected.
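
The cancellation of a(n) in (17) can be checked numerically. The sketch below is an illustration under assumed values: the function names and the parameter values are hypothetical, and the common denominator D(i,j,.) of (15) and (16) is omitted because it cancels in the ratio.

```python
def odds(a: float, c: float, theta_r: float) -> float:
    """Equation (3): odds of a response with category parameter theta_r."""
    return (a * c) ** theta_r


def conditional_probability(a: float, c_i: float, c_j: float,
                            theta_s: float, theta_r: float) -> float:
    """Equation (17): P(s on item i | items i and j received s and r in some order)."""
    p_sr = odds(a, c_i, theta_s) * odds(a, c_j, theta_r)   # numerator of (15)
    p_rs = odds(a, c_i, theta_r) * odds(a, c_j, theta_s)   # numerator of (16)
    return p_sr / (p_sr + p_rs)


# Two persons of very different (hypothetical) ability give the same value.
low = conditional_probability(a=0.5, c_i=2.0, c_j=0.5, theta_s=1.0, theta_r=0.3)
high = conditional_probability(a=3.0, c_i=2.0, c_j=0.5, theta_s=1.0, theta_r=0.3)

# Closed form without a(n): c_i**d / (c_i**d + c_j**d), d = theta_s - theta_r.
d = 1.0 - 0.3
closed_form = 2.0 ** d / (2.0 ** d + 0.5 ** d)
```

Both calls return the same number because a(n) enters (15) and (16) with the same exponent, θ(s) + θ(r).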


We can rewrite (15) and (17) using (3) as

(19)  $p^{sr}_{ij} = \dfrac{[c(i)]^{\theta(s)}\,[c(j)]^{\theta(r)}}{[c(i)]^{\theta(s)}\,[c(j)]^{\theta(r)} + [c(i)]^{\theta(r)}\,[c(j)]^{\theta(s)}}$

and (16) and (18) as

(20)  $p^{rs}_{ij} = \dfrac{[c(i)]^{\theta(r)}\,[c(j)]^{\theta(s)}}{[c(i)]^{\theta(s)}\,[c(j)]^{\theta(r)} + [c(i)]^{\theta(r)}\,[c(j)]^{\theta(s)}}, \qquad i \neq j.$

Note that (19) and (20) are independent of a(n) and their
denominators are identical. Using response data we can estimate the
probabilities represented by (19) and (20) with observables. If we take the ratio
of (19) and (20) we get the considerable simplification

(21)  $\dfrac{p^{sr}_{ij}}{p^{rs}_{ij}} = \dfrac{[c(i)]^{\theta(s)-\theta(r)}}{[c(j)]^{\theta(s)-\theta(r)}}.$


3.3  From  Probability  to  Frequency 

Let

(22)  f(i,s;j,r) be the observed frequency with which persons responded
to item i with response s, and to item j with response r.
Similarly, we define f(i,r;j,s) and g(i,j;r,s) = f(i,s;j,r) +
f(i,r;j,s).

Now (19) may be empirically estimated by

(23)  $f(i,s;j,r)\,/\,g(i,j;r,s).$

Similarly (20) is estimated by

(24)  $f(i,r;j,s)\,/\,g(i,j;r,s),$

and hence (21) is simply estimated by the frequency ratio

(25)  $f(i,s;j,r)\,/\,f(i,r;j,s).$
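
A minimal sketch of the tabulation behind (22) and (25), assuming the response data are held as one list of category numbers (1 to R) per person. The function names and the tiny data set are hypothetical and serve only to show the counting.

```python
from typing import List, Sequence


def tabulate(responses: Sequence[Sequence[int]], s: int, r: int) -> List[List[int]]:
    """F[i][j] = f(i,s;j,r): persons answering s on item i and r on item j."""
    n_items = len(responses[0])
    F = [[0] * n_items for _ in range(n_items)]
    for person in responses:
        for i in range(n_items):
            if person[i] != s:
                continue
            for j in range(n_items):
                if j != i and person[j] == r:
                    F[i][j] += 1
    return F


def frequency_ratio(F: List[List[int]], i: int, j: int) -> float:
    """Equation (25): f(i,s;j,r) / f(i,r;j,s); note f(i,r;j,s) is stored as F[j][i]."""
    return F[i][j] / F[j][i]


# Five hypothetical persons, three items, categories 1..4; s = 1 and r = 4.
data = [[1, 4, 1], [1, 4, 4], [4, 1, 1], [1, 1, 4], [1, 4, 2]]
F = tabulate(data, s=1, r=4)
ratio_01 = frequency_ratio(F, 0, 1)
```

A zero count in the denominator makes the ratio undefined; a real tabulation would have to decide how to treat such pairs, which this sketch does not attempt.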

3.4  Item and Response Parameter Estimation

Given the response data the frequencies for all pairs of items
and responses may be easily tabulated. From these data the frequency ratios

(26)  $R(i,j) = \dfrac{f(i,s;j,r)}{f(i,r;j,s)}$  for i > j = 1,I

(27)  $R(j,i) = \dfrac{f(i,r;j,s)}{f(i,s;j,r)}$  for j > i = 1,I; s ≠ r = 1,R

and,

(28)  R(i,i) = Blank.

may be easily computed. We note that (26) estimates

(29)  $\dfrac{[c(i)]^{\theta(s)-\theta(r)}}{[c(j)]^{\theta(s)-\theta(r)}},$

and (27) estimates

(30)  $\dfrac{[c(i)]^{\theta(r)-\theta(s)}}{[c(j)]^{\theta(r)-\theta(s)}}.$


Also let

(31)  $\theta(s) - \theta(r) = A(s,r).$

We now may express the natural logarithm of (29) and (30) as

(32)  $A(s,r)\ln c(i) - A(s,r)\ln c(j) = \ln R(i,j), \quad i > j,$ and

(33)  $-A(s,r)\ln c(j) + A(s,r)\ln c(i) = -\ln R(j,i), \quad j > i.$

Now if we add (32) and (33) over i for fixed j, after adding
and subtracting A(s,r) ln c(j), and using R(i,j) = 1/R(j,i) from (26) and (27), we get

(34)  $-\sum_{i} \ln R(i,j) = -T(j) = I\,A(s,r)\ln c(j) - A(s,r)\sum_{i}\ln c(i).$

We now make the final assumption for our method. For definiteness
of the t(i) scale, we assume that

(35)  $\sum_{i=1}^{I} t(i) = \sum_{i=1}^{I} \ln c(i) = 0.$

From (34) we may now write, for each item i,

(37)  $t(i) = -\dfrac{T(i)}{I} = A(s,r)\ln c(i).$

Because A(s,r) is a function of θ(r) and θ(s) we have difficulty
in extracting c(i) explicitly from (37). Fortunately we foresaw this
difficulty and made assumption (1). Setting s = 1 and r = R, then θ(s) = 1, θ(r) = 0
and A(s,r) = θ(s) - θ(r) = 1 - 0 = 1. We now write (37) as

(38)  $\ln c(i) = t(i),$ or

(39)  $c(i) = e^{t(i)}.$

Let the c(i) as estimated from response categories (1,k), k ≠ R, be
designated as t(i,k); then by (9) we write

(40)  $t(i,k) = \theta(k)\,t(i), \qquad k \neq 1, R.$

Equation (40) enables us to estimate the response parameters
θ(k) having first estimated the c(i) using data for response categories 1
and R. Since this process may result in considerable estimation error we
will use an averaging process in practice. This will be demonstrated in
our examples.


3.5  Person  Parameter  Estimation 


Our analysis has shown the theoretical derivation of the
item parameters c(i) and the response parameters θ(r). We need next to
consider the estimation of a(n) for a person. To do this we introduce the
concept of a "score".

Let the vector of estimated response parameters be denoted by
θ and its transpose by θ', then

(41)  $\theta = (1, \theta(2), \theta(3), \ldots, 0)$

and let Y be the "score vector" of integers y(k) indicating the
number of the I items responded to in the kth response category by the nth
person. Then

(42)  $Y = (y(1), y(2), \ldots, y(R)),$ where

(43)  $\sum_{k} y(k) = I.$

If person n does not respond to all I items because of time or
other reasons, (43) will not be strictly true. This does not bother our
theory, but merely extends the computations.

We now define the "score" of a person to be

(44)  $S(n) = Y\theta'.$

This definition is congruent with the usual definition of scores
in the Binary Model where the score is the number of correct responses. For
example, suppose I = 20 and person n got 12 items correct and 8 items
incorrect. Then

(45)  $S(n) = (12, 8)\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 12.$

In binary scored tests the number of possible scores is I+1,
but for polychotomous tests the number of possible scores, even with (43),
can be very large. This number, S, is the total number of permutations of
each part of the possible partitions of I, among R categories.

For example suppose I = 5 and R = 4, then the partitions of 5 that
are of interest when R = 4 are given below for the parts of (46). In this case

$S = 4 + 12 + 12 + 4 + 12 + 12 = 56.$

(46)   5,0,0,0    2,1,1,1
       4,1,0,0    3,2,0,0
       3,1,1,0    2,2,1,0

Figure 1.  Possible score vectors for the polychotomous
situation with I = 5, R = 4.  (* non-estimable)

Permuting each part of each partition among R = 4 categories we
get the list of S = 56 possible score vectors shown in Figure 1. We assume
here that (43) holds; otherwise the list is considerably larger.
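
The score definition (44) and the count S = 56 can be checked with a short enumeration. This is a sketch only: the function names are hypothetical, and the vectors are generated by brute force rather than by permuting the partitions in (46).

```python
from itertools import product
from math import comb
from typing import List, Sequence, Tuple


def score(Y: Sequence[int], theta: Sequence[float]) -> float:
    """Equation (44): S(n) = Y theta'."""
    return sum(y * th for y, th in zip(Y, theta))


def score_vectors(n_items: int, n_categories: int) -> List[Tuple[int, ...]]:
    """All vectors (y(1), ..., y(R)) of non-negative counts satisfying (43)."""
    return [Y for Y in product(range(n_items + 1), repeat=n_categories)
            if sum(Y) == n_items]


binary_score = score((12, 8), (1, 0))        # the example in (45)
vectors = score_vectors(5, 4)                # I = 5, R = 4
```

The count agrees with the number of ways of placing I items in R categories, comb(I + R - 1, R - 1), which for I = 5 and R = 4 is comb(8, 3).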

In order to estimate the a(n) associated with a given score
we find the a(n) for which the sum of (7), for response 1, over the I items
equals the score, i.e.

(47)  $\sum_{i=1}^{I} P(n,i,1) = S(n).$

Equation (47) is solved for a(n), recalling (7) and (3), by
iteration such as by the Method of Newton-Raphson. Accordingly, for each
score vector of interest the associated a(n) may be determined. Notice also
that in this procedure no information regarding which items were responded
to in the manner indicated by the score vector is utilised.

In order to estimate the ability, a(n), associated with the
score S(n) for our model we write

(48)  $\sum_{i=1}^{I} \dfrac{a(n)\,c(i)}{\sum_{r} [a(n)\,c(i)]^{\theta(r)}} = S(n)$

as the function upon which the estimation of a(n) is to be based. Our
principle of estimation is that a(n) is to be derived by setting S(n) equal to
its expectation in the model. This is accomplished by using (48).


Our problem is to find that a(n) that satisfies (48). This
problem is efficiently solved by using the Newton-Raphson method of finding
a root of an equation of the form f(x) = 0. To this end we rewrite (48), with x = a(n), as

(49)  $f(x) = \sum_{i=1}^{I} \dfrac{x\,c(i)}{\sum_{r} [x\,c(i)]^{\theta(r)}} - S(n) = 0.$

We start by guessing at the root, the guess being x = a(n) = x_1 =
S(n)/(I-S(n)), say. Suppose, however, that the root is actually x_1 + h. Then
by Taylor's calculus expansion of f(x),

(50)  $f(x_1 + h) = f(x_1) + h\,f'(x_1) + \ldots$

Our reasoning takes the following lines:

1.  The error of approximation to the real root is h.

2.  The error is small enough to permit us to ignore the terms
in the Taylor expansion in which h appears to a power greater than the first.

3.  We may therefore write, since the actual root is (x_1 + h),

(51)  $0 = f(x_1) + h\,f'(x_1)$ approximately.

The error is,

(52)  $h = -\dfrac{f(x_1)}{f'(x_1)}$ approximately.

The iterated use of this method gives the general formula for
the approximation to x as

(53)  $x_{k+1} = x_k - \dfrac{f(x_k)}{f'(x_k)}.$

This iterative procedure is easily programmed for digital
computers once the derivative of f(x) has been made explicit. Moonan and
Potterton have done this at NPRDC. This subroutine can be used to develop
a table, or individual values, of a(n) associated with scores S(n) given vectors
of values for the c(i) and θ(r) for a particular measuring instrument.
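
A sketch of the iteration (53) applied to (47) is given below. It is not the subroutine mentioned in the text; the function names, the positivity safeguard, the tolerance and the parameter values are all assumptions made for illustration. The derivative uses the identity dP(n,i,1)/dx = P(n,i,1)[θ(1) - Σ θ(k)P(n,i,k)]/x, which follows from (3) and (7).

```python
from typing import Sequence, Tuple


def first_category_sum(x: float, c: Sequence[float],
                       theta: Sequence[float]) -> Tuple[float, float]:
    """Return sum_i P(n,i,1) and its derivative with respect to x = a(n)."""
    total_p, total_dp = 0.0, 0.0
    for ci in c:
        odds = [(x * ci) ** th for th in theta]           # equation (3)
        denom = sum(odds)
        probs = [o / denom for o in odds]                 # equation (7)
        mean_theta = sum(th * p for th, p in zip(theta, probs))
        total_p += probs[0]
        total_dp += probs[0] * (theta[0] - mean_theta) / x
    return total_p, total_dp


def solve_ability(score: float, c: Sequence[float], theta: Sequence[float],
                  tol: float = 1e-12, max_iter: int = 100) -> float:
    """Solve (47) for a(n) by Newton-Raphson, starting from S/(I - S)."""
    n_items = len(c)
    if not 0.0 < score < n_items:
        raise ValueError("a(n) is non-estimable for zero or perfect scores")
    x = score / (n_items - score)                         # starting guess from the text
    for _ in range(max_iter):
        p_sum, dp_sum = first_category_sum(x, c, theta)
        new_x = x - (p_sum - score) / dp_sum              # equation (53)
        if new_x <= 0.0:                                  # assumed safeguard, not in the text
            new_x = x / 2.0
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical instrument: five items, four categories.
c = [2.5, 2.0, 1.0, 0.5, 0.286]
theta = [1.0, 0.6, 0.3, 0.0]
target, _ = first_category_sum(1.7, c, theta)             # score implied by a(n) = 1.7
estimate = solve_ability(target, c, theta)
```

Calling solve_ability once for each score of interest would give the kind of table of a(n) values described above.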

3.6  Agreement Between the Model and the Response Data

We can now assume that we have response data for which the item
parameters c(i) and the response parameters θ(r) have been estimated. We
desire next to use these estimates to test if the model, represented by (7),
agrees with the observed response data. This test is sometimes referred to
as a "Goodness of Fit Test". The hypothesis tested is that the items used
evoked responses from the persons in accordance with the PMM. To make this
test we use Fisher's (1948) χ² goodness of fit test. To this end consider

(54)  F(s,r) = A matrix of the observed frequencies f(i,s;j,r) and
f(i,r;j,s), defined by (22), and arranged in I rows
and I columns for fixed s and r.

For every pair of i and j both (19) and (20) can be computed
with the parametric estimates c(i) and θ(r). When these are appropriately
multiplied by g(i,j;r,s) of (22) they provide expected frequencies, under the
model, with which χ²(i,j) may be computed and tabulated to get a χ²(i) for
each item with I-R degrees of freedom. This test also serves as a basis for
selecting items to include or not in the analysis. See Moonan (1974, p. 13).
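
One possible reading of this computation is sketched below: for each pair of items the expected frequencies are g(i,j;r,s) times (19) and (20), and the χ²(i,j) contributions are summed over j to give χ²(i). The exact form of the statistic is not spelled out in the text, so the Pearson form used here, the function names and the test data are assumptions.

```python
from typing import List, Sequence


def pair_probability(c_i: float, c_j: float, theta_s: float, theta_r: float) -> float:
    """Equation (19); equation (20) is one minus this value."""
    num = c_i ** theta_s * c_j ** theta_r
    return num / (num + c_i ** theta_r * c_j ** theta_s)


def item_chi_square(F: Sequence[Sequence[float]], c: Sequence[float],
                    theta_s: float, theta_r: float) -> List[float]:
    """Sum Pearson chi-square terms over all partner items j for each item i."""
    n_items = len(c)
    chi = [0.0] * n_items
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            g = F[i][j] + F[j][i]                  # g(i,j;r,s) of (22)
            p = pair_probability(c[i], c[j], theta_s, theta_r)
            expected_sr = g * p                    # expected f(i,s;j,r)
            expected_rs = g * (1.0 - p)            # expected f(i,r;j,s)
            chi[i] += ((F[i][j] - expected_sr) ** 2 / expected_sr
                       + (F[j][i] - expected_rs) ** 2 / expected_rs)
    return chi


# Hypothetical check: frequencies built to fit the model exactly give zero,
# and disturbing one cell raises the statistic for the two items involved.
c = [2.0, 1.0, 0.5]
F = [[1000.0 * ci / (ci + cj) if i != j else 0.0
      for j, cj in enumerate(c)] for i, ci in enumerate(c)]
chi_fit = item_chi_square(F, c, theta_s=1.0, theta_r=0.0)
F_bad = [row[:] for row in F]
F_bad[0][1] += 50.0
chi_misfit = item_chi_square(F_bad, c, theta_s=1.0, theta_r=0.0)
```

Referring χ²(i) to a χ² table with the degrees of freedom stated in the text would complete the test; that step is left out here.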

IV.  APPLICATIONS 

4.1  Applications  to  Ability  Testing 

Measurement of abilities with the BMM is common practice and
need not be elaborated upon here except to say that in this type of testing
binary response categories are often too restrictive. It would be better,
for some purposes, if the response categories could be extended. For
example in performance testing the method used to find an answer may be as
important as the answer itself. In such a case four response categories may
be appropriate which we diagram as follows:

                      correct method     incorrect method

correct answer        r = 1              r = 2 or 3

incorrect answer      r = 2 or 3         r = 4

Whether the upper right cell or the lower left cell corresponds
to r = 2 or r = 3 may be data dependent.

4.2  Applications to Attitude Estimation

An attitude instrument is often designed to appraise a person's
favorableness toward some institution, group or concept. Other types measure
interests and motivations. Often the response categories are polychotomous
and consist of prescaled statements indicating degrees of significance or a
status condition. These devices seem most naturally to be analysed with a PMM.

4.3  Applications to Surveys and Questionnaires

If all or part of a survey or questionnaire instrument seeks to
measure a unidimensional property of persons and the responses can be
transformed or converted to the same set for each question, then the PMM is
applicable and the analysis given here is appropriate. An example of this is a survey
instrument designed to assess the quality of houses for the purpose of
establishing a value for sale or for insurance. It seems better to establish the uniform
response categories a priori rather than go through a re-coding process.

V.  EXAMPLE 

5.1  A  Simulation  Example 

We have not yet completed a computer program that carries out the
entire PMM analysis. We expect to do this in the future. In the meantime we
show our procedure by means of a very detailed example using simulated data.

We organise the computations in five phases, as follows:

Phase

I      Data Tabulations

II     c(i) Estimation

III    θ(r) Estimation

IV     Goodness of Fit

V      a(n) Estimation

Each phase involves several steps.

5.1.1  The  Situation 

Although real data for the PMM are plentiful we chose
to simulate the set we use here because, in doing so, we would then know the
true values of the parameters we try to estimate. The advantages of this
practice are obvious.

Our simulation involved generating R = 4 choice responses
to each of I = 5 items for N = 1000 persons. The item parameters were

(56)    c(1) = 3.50     c(4) = .50
        c(2) = 2.00     c(5) = .286
        c(3) = 1.00

The response parameters selected were

(57)    θ(1) = 1.000    θ(3) = -.3162
        θ(2) = .3162    θ(4) = .0000

The alpha parameters were obtained by adding the absolute value of a
random selection from a normal distribution N(0,1) to a fixed constant for
different persons; the constant was regularly modified so as to simulate
persons with different levels of ability.

Thus the simulation program could compute R = 4 odds according
to (3) and hence R = 4 probabilities according to (7). These probabilities
could then be accumulated; next a uniform random number was generated and
compared with the accumulated response probabilities. From this information a
response was designated. This work was repeated I = 5 times for the same
person and N = 1000 times for all persons in the sample. The response data and
other information were punched on cards.
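
The response-designation step described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
probabilities below are hypothetical stand-ins, since equations (3) and (7)
are not restated in this section, and the function names are invented for the
sketch rather than taken from the original simulation program.

```python
from itertools import accumulate
import random


def designate_response(probabilities, u):
    """Return the 1-based category whose accumulated probability first exceeds u."""
    for category, bound in enumerate(accumulate(probabilities), start=1):
        if u < bound:
            return category
    # Guard against floating-point rounding when u is very close to 1.
    return len(probabilities)


def simulate_person(item_probabilities, rng):
    """Designate one response per item for a single simulated person."""
    return [designate_response(p, rng.random()) for p in item_probabilities]


# Hypothetical probabilities for I = 5 items and R = 4 categories.
example_items = [[0.4, 0.3, 0.2, 0.1]] * 5
print(simulate_person(example_items, random.Random(1978)))
```

Repeating `simulate_person` once per person, with probabilities that depend
on that person's alpha parameter, would give the N by I response array that
the remaining phases analyse.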

5.1.2  Data  tabulation 

Assuming that the responses have been recorded on optical
mark sense forms or have been placed on Hollerith cards we generate the
frequencies in tabular form, T(i,j), shown in Figure 2. Notice the array of
frequencies given in Figure 2 is doubly symmetrical. The array consists of
blocks of responses, B(i,j), to pairs of items i and j. Within each block are
the frequencies of response types to each item pair. For example, 63 persons
responded in the second category to the first item and in the third category to
the third item, etc. The total popularity of a response category is indicated
in blocks B(i,i). For example, the third category of the first item is
relatively unpopular.

The matrix T(i,j) is not required to be printed but its
construction is required for subsequent phases of our analysis. The
construction of this table concludes the Data Tabulation Phase.
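
A minimal sketch of how one block of the tabulation could be built is given
below. It assumes the responses are held as one list of category numbers
(1 to R) per person; the function name and the tiny data set are hypothetical
and serve only to show the counting and the symmetry noted above.

```python
def tabulate_block(responses, i, j, n_categories):
    """Count persons by (category chosen on item i, category chosen on item j)."""
    block = [[0] * n_categories for _ in range(n_categories)]
    for person in responses:
        block[person[i] - 1][person[j] - 1] += 1
    return block


# Hypothetical responses of four persons to two items (categories 1..4).
responses = [[1, 2], [1, 2], [2, 1], [4, 4]]

b01 = tabulate_block(responses, 0, 1, 4)  # block for the item pair (1, 2)
b10 = tabulate_block(responses, 1, 0, 4)  # mirror block for the pair (2, 1)
b00 = tabulate_block(responses, 0, 0, 4)  # diagonal block: category popularity

print(b01)
print([b00[k][k] for k in range(4)])
```

The block for (j, i) is the transpose of the block for (i, j), and a diagonal
block carries the category totals of a single item on its own diagonal.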


5.1.3 c(i) Estimation

Corresponding to (22) for s = 1 and r = 4 we construct (54),
P(i,j), from T(i,j). This is easily done and produces

(58)

According to (26) we next find the frequency ratios in R(i,j)

(59)  R(i,j)

  i\j       1        2        3        4        5

   1        -      1.694    3.073    9.056   12.786
   2      .590       -      2.182    5.000    5.318
   3      .325     .458       -      2.186    3.231
   4      .110     .200     .457       -      1.571
   5      .078     .188     .310     .636       -

The skew-symmetric matrix resulting from taking logarithms
of R(i,j) by (32) and (33) is

(60)

(61)
            1        2        3        4        5

   1        -       .527    1.124    2.203    2.548
   2      -.527      -       .780    1.609    1.671
   3     -1.124    -.780      -       .782    1.173
   4     -2.203   -1.609    -.782      -       .452
   5     -2.548   -1.671   -1.173    -.452      -

Then according to (36), (37) and (39)

T(i):               6.402    3.533     .051   -4.142   -5.844

t(i):               1.280     .707     .010    -.828   -1.169

ĉ(i):               3.598    2.027     .990     .437     .311

Item Parameters:    3.500    2.000    1.000     .500     .286

Notice that the item parameters and their estimates are in
good agreement. This concludes the second phase of c(i) estimation.
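
The arithmetic of this phase can be retraced with the short sketch below. It
rests on two assumptions. First, the observed pair frequencies are
back-computed from the ratios in (59) and the pair totals printed in section
5.1.5, because matrix (58) itself is not legible here. Second, equations
(26) to (39) are read as: take logarithms of the ratios, sum each row, divide
by the number of items, and exponentiate. The function name is hypothetical.

```python
import math

# Pair frequencies P(i,j); back-computed (an assumption) so that
# P(i,j) / P(j,i) matches (59) and P(i,j) + P(j,i) matches the pair totals.
observed = [
    [0, 83, 126, 163, 179],
    [49, 0, 96, 135, 117],
    [41, 44, 0, 94, 84],
    [18, 27, 43, 0, 55],
    [14, 22, 26, 35, 0],
]


def estimate_item_parameters(freq):
    """Return (t, c_hat): mean log-ratio per item and its exponential."""
    n = len(freq)
    t_values, c_hat = [], []
    for i in range(n):
        # Row total of the skew-symmetric log-ratio matrix.
        row_total = sum(
            math.log(freq[i][j] / freq[j][i]) for j in range(n) if j != i
        )
        t_i = row_total / n
        t_values.append(t_i)
        c_hat.append(math.exp(t_i))
    return t_values, c_hat


t_values, c_hat = estimate_item_parameters(observed)
print([round(t, 3) for t in t_values])
print([round(c, 3) for c in c_hat])
```

Under these assumptions the first, second, fourth and fifth estimates
reproduce the printed 3.598, 2.027, .437 and .311. The third comes out near
1.010, the exponential of +.010, whereas the printed .990 is the exponential
of -.010; the sketch does not try to resolve that difference.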

5.1.4 θ(r) Estimation

Our purpose now is to estimate the θ(r) from (40). To do
this we use the procedures of section 5.1.3 to estimate the t(i), k = 1, 4.
This work has been completed and is summarised below:

Summary of the t(i) calculations for k = s = 2, 3 and r = 4

   i      k = 2     k = 3        a          b

   1      .342     -.330       .2672     -.2578
   2      .232     -.113       .3281     -.1598
   3      .073      .014      7.3000     1.4000
   4     -.284      .103       .3430     -.1244
   5     -.362      .324       .3097     -.2772

According to (40) we estimate θ(2) and θ(3) by dividing
column 2 and 3 entries above by the t(i) values obtained at the end of section
5.1.3. This division gives columns a and b above. Compared with the values
used in our simulation, (57), columns a and b above show, embarrassingly,
considerable variance. Column averages, omitting the i = 3 estimates, give
θ(2) = .3120 and θ(3) = -.2048. Using these values the estimates of the
response parameters are:

        θ(1) = 1.000    θ(3) = -.205
        θ(2) = .312     θ(4) = .000

Their agreement with (57), the parameters, is nothing to
write home about even considering their questionable estimation procedure.
This concludes the θ(r) estimation phase.
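
The division and averaging just described can be checked with the sketch
below. The t values are those printed above; the second entry for k = 3
(-.113) is an assumption, recovered as the printed ratio -.1598 times
t(2) = .707 because the printed cell is illegible. The function name is
hypothetical.

```python
# t(i) from section 5.1.3 and the t values for k = 2 and k = 3.
t_item = [1.280, 0.707, 0.010, -0.828, -1.169]
t_k2 = [0.342, 0.232, 0.073, -0.284, -0.362]
t_k3 = [-0.330, -0.113, 0.014, 0.103, 0.324]  # -0.113 is back-computed


def estimate_theta(t_k, t_item, skip=()):
    """Divide t_k by t_item item by item and average the ratios not skipped."""
    ratios = [tk / ti for tk, ti in zip(t_k, t_item)]
    kept = [r for index, r in enumerate(ratios) if index not in skip]
    return sum(kept) / len(kept), ratios


# Item 3 (index 2) is omitted: its t(i) of .010 makes the ratio unstable.
theta2, column_a = estimate_theta(t_k2, t_item, skip={2})
theta3, column_b = estimate_theta(t_k3, t_item, skip={2})
print(round(theta2, 4), round(theta3, 4))
```

The near-zero divisor for item 3 is what produces the outlying ratios 7.3 and
1.4, which is presumably why that row is left out of the averages.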

5.1.5  Goodness  of  Fit  Test 

Our purpose is to use (19) and our estimates of c(i) and
θ(r) to compute the expected frequencies corresponding to (51). First, however,
we obtain G(i,j) as

(64)
            1      2      3      4      5

   1        -     132    167    181    193
   2       132     -     140    162    139
   3       167    140     -     137    110
   4       181    162    137     -      90
   5       193    139    110     90     -


For i > j use (20) for the lower segment of (65) and for j > i use (19) for the
upper segment of (65).

(65)
            1        2        3        4        5

   1        -       ...     .7842    .8917    .9204
   2       ...       -      .6719    .8226    .8670
   3       ...      ...       -      .6938    .7610
   4       ...      ...      ...       -      .5842
   5       ...      ...      ...     .4158      -

Multiplying (65) element-wise by (64) we get a matrix of expected frequencies

(66)
            1        2        3        4        5

   1        -      84.43   130.96   161.40   177.64
   2      47.57      -      94.07   133.26   120.51
   3      36.04    45.93      -      95.05    83.71
   4      19.60    28.74    41.95      -      52.58
   5      15.36    18.49    26.29    37.42      -

We calculate χ²(i,j) using (58) and (66) with (67)

(67)    χ²(i,j) = [P(i,j) - E(i,j)]² / E(i,j).

This gives

            1        2        3        4        5

   1        -      .0242    .1879    .0157    .0104
   2      .0430      -      .0396    .0227    .1022
   3      .6826    .0811      -      .0116    .0010
   4      .1306    .1053    .0263      -      .1114
   5      .0255    .6663    .0032    .1565      -

We next add the previous table symmetrically to give (69) and then add across
the columns of (69) to give χ²(i).

The row totals χ²(i) have I - R degrees of freedom and
measure the fit of each item to the model (7). The critical point for
d.f. = 5 - 4 = 1 and α = .05 is 3.841. None of the χ²(i) even approach this
value, as expected, since the data were generated from the model's equation,
(7). This concludes the goodness of fit phase.
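
The goodness of fit computation can be sketched as follows. Several things
here are assumptions rather than statements from the text: the observed
frequencies are back-computed from the ratios (59) and the pair totals; the
expected proportion c(i) / (c(i) + c(j)) is inferred from the printed numbers
because (19) and (20) are not restated in this section; and the function name
is hypothetical.

```python
c_hat = [3.598, 2.027, 0.990, 0.437, 0.311]  # estimates from section 5.1.3

# Back-computed pair frequencies (an assumption; matrix (58) is not legible).
observed = [
    [0, 83, 126, 163, 179],
    [49, 0, 96, 135, 117],
    [41, 44, 0, 94, 84],
    [18, 27, 43, 0, 55],
    [14, 22, 26, 35, 0],
]


def goodness_of_fit(observed, c_hat):
    """Return expected frequencies, cell chi-squares and per-item totals."""
    n = len(c_hat)
    expected = [[0.0] * n for _ in range(n)]
    chi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            pair_total = observed[i][j] + observed[j][i]
            expected[i][j] = pair_total * c_hat[i] / (c_hat[i] + c_hat[j])
            chi[i][j] = (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    # Add the table symmetrically, then sum across each row.
    item_chi = [
        sum(chi[i][j] + chi[j][i] for j in range(n) if j != i) for i in range(n)
    ]
    return expected, chi, item_chi


expected, chi, item_chi = goodness_of_fit(observed, c_hat)
print([round(x, 3) for x in item_chi])
```

Under these assumptions every item total stays well below the critical point
of 3.841, in line with the conclusion stated above.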


5.1.6 α(n) Estimation

See Section 3.5. This concludes the analysis for this example.

5.2 A Questionnaire Example

Following we provide a description of an interesting and useful
situation to which the PMM and its analysis could be applied, but cannot be
because the response data are hopelessly unobtainable from the NPRDC Magnetic
Tape library.

H. McDowell (1972) developed a series of questionnaires for each
of several Naval ratings. The items of these forms were statements of tasks
which must be performed by Naval personnel on active duty who were employed in
the rating. There were four responses for each task which indicated how
independently, if at all, each task was performed by the Naval person. The
responses to each task-item were:

a. I supervise the performance of this task.
b. I perform this task.
c. I assist in the performance of this task.
d. This task is not performed in my duties.

It seems to this author that the variable which these questionnaires
could be thought to measure is "task performance competency." For data which
fit the model a measure of this variable could be constructed and applied to
Naval personnel who are actively, or not, engaged in the measured rating. These
measures would obviously be useful for advancement, assignment and
re-assignment decisions.

Although the survey data were useful and informative for other
purposes, they could have been of even greater benefit to the Naval Personnel
System.

VI. SUMMARY

We  have  accomplished  what  was  intended  as  stated  in  the  introduction. 
Specifically,  we  defined  the  responses,  relevant  parameters  and  measurement 
situation  related  to  a  model  called  the  Polychotomous  Measurement  Model.  Both 
the  theoretical  and  numerical  analyses  of  the  model  were  carried  out  in  detail. 
Background  material  to,  and  applications  of  the  model  were  provided. 


VII.

INTRODUCTION

To develop a force structure or even a small organizational structure
there is ordinarily a need to align duties by grade. This is a simple
matter of distinguishing responsibility and providing the best gradation
of skill and expected performance. As the Department of Army faces
changing personnel requirements in organizational design for particular
missions and units and for comprehensive restructuring such as the Enlisted
Personnel Management System (EPMS) changes, there must be available a
reliable and valid methodology to determine objective standards of grade
authorization.

Since the pay grade allocation for enlisted jobs (E-1 through E-9)
generally carries increasing levels of responsibility and opportunity, the
job requirement factors require accurate estimation and scaling to develop
pertinent grade standards. Job factors such as Knowledge and Combat
Exposure are rated to identify the level of the factor needed to expect a
job will have sufficient skill and authority invested to get the desired
productivity.

The DA personnel analysts and various MOS job proponents and major
commanders have to have a methodology which can dependably identify grade
standards and which will relate to the total enlisted job structure. This
methodology has to compute grade optimally, but occasionally even must work
when grade is constrained to accommodate curtailed grade allocation to
satisfy grade ceilings.


The  views  expressed  in  this  paper  are  those  of  the  author  and  do  not 
necessarily  reflect  the  views  of  the  Army  Research  Institute,  the 
Military Personnel Center, or the Department of the Army.

Because grade determination follows a given administrative and
psychometric procedure, the methodology becomes a combination of judgmental
and rating behaviors. With the grade assignment process managed at the DA
level and recommendations for new positions or grade changes coming into
the designated action personnel, there is quite often a tendency for the
requesting grade adjustment to appear out-of-balance on at least one job
factor or grade standard. The grade estimate then must translate what
appears at times to be rather subjective judgment into objective and
predictable grade.

For many years the grading of jobs or positions could rely on
traditional military structure and in most cases this reference to experience
and known mission requirements proved sufficiently adequate. From World
War II the Army began carefully documenting its techniques, policies, and
guidelines for assigning grade to enlisted jobs (Hadley, 1961). It became
more and more apparent that an approach that was quantifiable and
statistically defensible would have to form the foundation of the Army job
evaluation methodology. Such methodology would help in distinguishing the
job requirements most equitably while supporting the best allocation of grade
for job types and functions to permit better organizational and grade balance
projections for force and budgetary planning. At this point past approaches
were reviewed, and a concerted effort was directed to explore the predictive
value of possible enlisted multiple regression equations along lines already
pursued experimentally for Air Force officer positions (Christal, 1965).

In 1966-67 the traditional and accepted job factors were used to develop
several multiple regression equations to explain how enlisted grade could be
derived systematically and to explore the policy agreement of selected
raters evaluating job factors and the associated grade (Anderson, Corts,
and Waldkoetter, 1967). This effort was modestly successful and provided
a pragmatic beginning for enlisted job evaluation in that the sample of jobs
rated and the policy-board grades assigned only responded to a request for an
equation model. Treating any implications for those situations arising
with application of an equation, which is then tested against other policy
constraints, was simply avoided.

After nearly eight years and numerous personnel policy adjustments
leading to doctrine, training and force structure changes, the question
about equation policy implications was raised in the form of a research
need in accordance with the Army Regulation 70-8 (1975). In 1975 a few
key policy changes regarding the EPMS, the objective force grade constraints,
and the career force development, led to the situation of asking whether the
job factors, the related formula weights, and implications of any equation
usage were really functioning in the same original context of 1967. Knowing
that grade is compensation for money and work identity and that economic
indexes had shifted significantly in the post-Vietnam era, it was believed
important to reassess the basis for the 1967 equation and account for any
noticeable reactions to changing standards for grade determination.

METHOD


A sample of 200 enlisted benchmark jobs or duty positions was selected.
A group of military personnel analysts slightly modified the sample content
toward more combat arms positions to retain a concept that any job grading
procedure should have an orientation lending more influence to the basic
mission of the force. These analysts were aware that the initial sample
was constructed by a stratified random plan of selection. Once the sample
was agreed on by the analysts from the U.S. Army Military Personnel Center,
preparations were made by the same analysts to write the 200 duty position
job descriptions, which were supported using a bank of CODAP duty position
descriptions and others maintained in their usual processing of job actions.

Concurrently, the 10 job factors used for the 1967 equation were
reviewed to see that the definitions and scale values from one through six
were still acceptable for rating purposes, with two experimental job
evaluation or requirement scales being newly defined. The scales of Job
Satisfaction and Organisational Setting were designed to assist in getting
at evaluation actions where a job's worth is difficult to estimate: first,
because job satisfaction within a job may exist as an inherent grade bias
toward higher grade or satisfaction decreases with grade when grade should
offer a career attraction; and secondly, because organisational setting is
positively weighted regardless of the type of grade or job evaluation
action, may contribute too much toward some grade allocation, and had not
been quantitatively analysed. The research design then called for using
12 job factor scales as predictors of grades E-3 through E-9. The scales
are: (1) Knowledge; (2) Supervision; (3) Concentration and Attention;
(4) Freedom of Action; (5) Physical Effort; (6) Combat Exposure;
(7) Adaptability and Resourcefulness; (8) Responsibility for Material
Resources; (9) Physical Skills; (10) Job Conditions; (11) Job Satisfaction;
(12) Organisational Setting.

Three groups of policy boards were identified to estimate grade for the
duty positions and were comprised of 50 MILPERCEN officers, 75 Fort Harrison
officers, and 75 Fort Bliss NCO's. The three boards also independently rated
12 job factors. Policy board officers and NCO's each rated 20 duty position
descriptions with all 200 benchmark job descriptions rated in each group by
means of 10 stratified sets of 20 randomised job descriptions. Thus several
independent situations were arranged with equivalent rating scenarios to
determine the degree job factors were required to perform the duties for
each job and the most equitable grade. The choice of three groups provided
for several comparative criterion policy building steps. Also the range of
rating behavior would likely vary sufficiently to indicate to some extent
if the prediction of grade would stabilize enough to warrant use of any
given equation within an operational set of limitations related to the career
structure, work design, and duty position identity. The job sample size of
200 benchmark positions and selection of policy groups were believed adequate
to obtain sufficient prediction of grade that reasonable job factor reliability
would result and a grading equation would function quite credibly through the
different levels of personnel review and accounting.

Several policy choices were developed exploring what would happen when
the 200 officers and NCO's and combinations thereof furnished job factor
ratings and grade judgments. The MILPERCEN and Fort Harrison officer factor
ratings were used to predict criterion grade ratings of the Fort Bliss NCO's.
Then the NCO factor ratings were combined separately with each officer group
to predict each set of officer criterion grade ratings. This would allow
for a range of possible factor ratings to observe the effects on grade
prediction while identifying the combination yielding the highest multiple
correlation coefficient. Such an approach also gave a notion of equation
differences which might arise as factor weights are varied and criterion
grade ratings are alternately substituted. Although the validation of the
best equation rests on comparison of the three equations, a certain
fundamental stability of the highest multiple correlation coefficient was
being confirmed by obtaining correlations which would show they were
relatively close together. In this instance it is surely a stringent
alternative procedure for cross-validation because there are deliberate
variations introduced rather than randomly dividing the groups into two
subsamples and having the same criterion grade judgments used to compare
equations.

Policy choices further considered that equations represented different
perceptions of what might serve to determine grade. Should a mix of officer
and enlisted job factor ratings offer a more desirable grade prediction?
Or would the use of officer factor ratings best predict enlisted criterion
grade ratings? Then, too, could fewer factors predict grade almost as
accurately as all factors? These questions occur since they can influence
how job grading is done and whether particular values are consistent in the
grading process.

Besides trying to derive the optimal grading equation, there was a
persistent question about the vulnerability of a job to a downgrade action
if a requirement is directed to lower grade or hold to given grade constraints
to keep the force within specified boundaries. The process of grading jobs
poses a series of conflicting requirements at times to provide the most
equitable grade for the job and to hold the job structure to authorized force
levels or budgetary costs. Any policy should develop according to specified
steps. More often than not the best grade prediction should act as a solid
reference point and come from an independent rating process, then respond
to the limits imposed by personnel structure and grade quotas. A
questionnaire was administered to check on the rating effects of rater
opinions and what might help in the hopefully objective process of job grading
and pay allocation systems.


RESULTS 

After the 200 benchmark job descriptions were rated on the 12 job
requirement factors with proposed grade, the data were reduced to have
estimates of factor reliability and stepwise multiple correlations with
factor validity coefficients computed. The reliability estimates using
analysis of variance (Winer, 1962) were computed for each job factor to assure
that an adequate level of reliability was secured in the collected ratings.
Reliability coefficients ranged from .87 to .98 with most falling in the
middle ninety range; the reliability coefficient for the mean grade ratings
across sampled jobs was .98. For the experimental scales, Job Satisfaction and
Organizational Setting, the reliabilities were .93 and .96, respectively.
Validity coefficients for job factor ratings correlated with the criterion
of proposed grade for the combined groups were generally high to very high,
ranging from .56 to .90, but across the three combined groups of policy
board raters four factors had negative or extremely low validity. These
were Physical Effort, Combat Exposure, Physical Skills, and Job Conditions.
Both Job Satisfaction and Organizational Setting indicated high validity
(.82 and .88).

Multiple regression (R) equations were derived for three combined
groupings: MILPERCEN and Harrison with Bliss grade criterion; MILPERCEN
and Bliss with Harrison criterion; and Harrison and Bliss with MILPERCEN
criterion. The multiple R's were derived producing .94 using the Bliss
criterion grades, .93 using the Harrison criterion, and .92 using the
MILPERCEN criterion. Although the regression coefficients used to multiply
each factor weight varied to a greater or lesser extent in arriving at the
multiple R in each combined group, the end result was a highly predictive
equation. The equations almost uniformly predicted current or actual grade
given for the 200 job descriptions at .83. Perhaps rather oddly the 1967
equation predicted actual grade then with a multiple R of .83.
Superficially, none of the three equations predicting optimal grade seems to
give much advantage in predicting current grade, yet an observation is
suggested that the intervening 10 years and use of the 1967 grade equation
under personnel management constraints did not materially distort the
grade allocation order.

Since the multiple R of .94 was derived from the MILPERCEN and Fort
Harrison officers' factor ratings correlated with the Fort Bliss NCO's
criterion grade ratings, it was tentatively agreed that this equation
would provide more probable credibility. The officer factor ratings
suggest that officers must capably direct job requirement standards, while
the NCO's should know most competently what an enlisted job is comparatively
worth and must obtain corresponding productivity. More analyses are being
performed to make sure this equation has all of the statistical
characteristics to recommend a prototype application for a series of job
grading requests. There have already been some trial data collected which give
a trend of lower grade estimates than obtained with the 1967 equation,
but the sample of jobs is not representative and the personnel analysts
were not too experienced using the new equation. As it stands the
proposed constant value and weights for mean factor ratings in the new
equation are:

1.343 + 0.541X_I + 0.233X_II + (-0.353)X_III + 0.258X_IV + (-0.087)X_V
      + 0.039X_VI + 0.398X_VII + 0.093X_VIII + (-0.016)X_IX + 0.046X_X
      + 0.013X_XI + 0.132X_XII = computed grade.

When the first (lowest) level for all 12 job factors (i.e., X_I, X_II ... X_XII)
is entered into this equation, the result is 2.63. This value would
probably round to E-3. A shortened version of the factor equation in the
stepwise regression gave the constant value and positive weights of:

1.298 + 0.408X + 0.252X + 0.242X + 0.254X_VII + 0.187X = computed grade.

When the first (lowest) level for all five factors is entered into this
equation the result is 2.64. This equation is a practical substitute when a
quick estimate is asked for and the 12 factor equation is applied later for
review purposes.
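
As a rough check on the shortened equation, the sketch below evaluates a
constant plus weighted factor levels. The constant and weights are the five
printed above, taken in printed order; which job factor each weight attaches
to is not fully legible, so the code treats them as an ordered list. The
function and variable names are hypothetical.

```python
def computed_grade(constant, weights, levels):
    """Evaluate constant + sum(weight * factor level) for one job."""
    if len(weights) != len(levels):
        raise ValueError("one rated level is needed per factor weight")
    return constant + sum(w * x for w, x in zip(weights, levels))


FIVE_FACTOR_CONSTANT = 1.298
FIVE_FACTOR_WEIGHTS = [0.408, 0.252, 0.242, 0.254, 0.187]

# Every factor rated at the first (lowest) scale level.
lowest = computed_grade(FIVE_FACTOR_CONSTANT, FIVE_FACTOR_WEIGHTS, [1] * 5)
# Every factor rated at the sixth (highest) scale level.
highest = computed_grade(FIVE_FACTOR_CONSTANT, FIVE_FACTOR_WEIGHTS, [6] * 5)

print(round(lowest, 2), round(highest, 2))
```

With all five factors at the lowest level the sketch returns about 2.64,
matching the value reported above; with all five at level six it returns
about 9.36, so the scale of one through six spans roughly E-3 through E-9.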

The downgrading equation, or method by which downgrading could occur,
examined the policy of raters who would perform downgrading of an available
sample of enlisted positions. The rating groups were directed to reduce 10
of 20 job descriptions by one grade. The equation, then, predicted if a job
would have the factors for downgrading. Significant agreement seemed related
to three job factors with a negative value for Organisational Setting (XII),
and positive, counter-balancing values for Combat Exposure (VI) and
Responsibility for Material Resources (VIII). The equation, having a large
constant of 9.38, then had the following weights derived,

-1.30X_XII + .7?X_VI + .80X_VIII,

which are used basically to give an estimate for probability of a
"non-downgrading" action. The objective of predicting 10 downgrading
decisions by 20 raters led to the computation of a low-value index by dividing
the above equation solution by 20 to determine the non-downgrading
probability, with the probability of downgrading becoming greater as the
numerator becomes smaller. This approach only illustrates how downgrading may
have systematic formulation to guide such policy usage and control.
Pragmatically, it was found the factor weights could be substituted in the 12
factor equation and the result was a realistic operational estimate of whether
the job might appear vulnerable to downgrading. An example of the projected
use of the downgrading equation would result in the probability of downgrading
a position like Reenlistment NCO but the job of Rifleman would retain the
assigned grade.

A job rating questionnaire (JRQ) was given to all members of the three
policy rating boards. This questionnaire was used to obtain possibly relevant
information connected with downgrading and rating behavior, interpretation of
grade prediction equations and certain biodata. The interpretive sections of
the questionnaire explored three areas which could systematically influence
policy outcomes: job factor methodology and procedures; career management
considerations; and economic attitudes. The results suggest that the job
factors were adequately defined for the rating task and that the raters
brought their own judgment to bear with systematic grading standards.
Responses indicated the majority of judges found the technical areas and
special skill positions most difficult to evaluate. This finding may suggest
that a vertical grade structure is not the best way to reward technical-skill
performance, and that level of authority is hard to balance with amount of
specialization. An implication may encourage further research toward the
development of a skill progression and reward system separate from grade
progression. In terms of career management a variety of considerations
were dealt with relevant for promotion and regrading actions. It was
observed that nearly 80 percent of rater-judges believed that current
policies have resulted in the grade structure and promotions being more
competitive than five years ago. There was fundamental consensus among
the judges concerning retirement promotion with 66 percent indicating
promotion should be contingent upon evidence of at least one or more years
of active duty remaining before retirement. Given a choice between
downgrading some jobs or holding grade for all jobs in order to meet budgetary
constraints, 26 percent of the judges chose the former and 36 percent chose
the latter with the others voting to analyze the problem from a different
approach. The results may tend to suggest that interfering with career
progression is not viewed positively to satisfy budgetary constraints. In
exploring the economic attitudes of the judges, 66 percent indicated they
believed their grade estimates provide the income and recognition required
to assure adequate job performance.


Further, it was noted the promotion concept is not easily compensated
for and any approach to offer other types of rewards and recognition will
somehow have to be related to some concept of continued advancement or
career progression. Another observation occurred to imply that any
personnel-training actions tied to promotion delay or substitution will not be
readily accepted unless promotion is treated as a separate transaction and any
cost problem is adjusted to show it is not managed to affect promotion. The
background information (biodata) qualified the policy raters as having 75
percent with 10 or more years of service, working in a wide cross-section of
units, the majority having three years or less time-in-grade, 87 percent
educated beyond the high school level, and 94 percent having training for
their military duties through resident courses and OJT.


DISCUSSION 


The Army Research Institute, in conjunction with the Military Occupational
Development Division, MILPERCEN, has explored a new validation and
adaptation of job evaluation methodology in developing tentatively proposed
enlisted grade equations. The job requirement factor ratings given by 125
officers (Captain through Colonel) accurately predict the grade assigned
by a policy board of 75 non-commissioned officers (E8/9) selected for
experience and their extensive knowledge of enlisted duty positions. The
principal equation developed predicts enlisted grade by making use of the
strong relationship between quantitative estimates of 12 highly reliable
and generally valid job factors and the proposed grade for 200 well selected
"benchmark" duty positions.

The 12 factor and 5 factor equations provide a contemporary improvement
beyond the 1967 equation and related job evaluation materials. The job
evaluation factors were the same for 1977 with the exception that two factors,
Job Satisfaction and Organizational Setting, were added. Both received positive
weights, and Organizational Setting proved a very strong predictor. From
1967 to 1975, before the recently completed research, minor editing and
improvements were made on the original 10 job factors, which were defined
along traditional lines to be compatible with other DOD approaches and
Civil Service to express job duties in terms of similar, logically derived
scales of job requirements.

Job factor methodology as presented by the 1977 equations provides a
reasonably valid grading and decision-making process. Similar factor systems
have been developed and updated by the Civil Service Commission (Anderson
and Sorts, 1971) and the Air Force (Christal, 1975). The quadrennial review of
military compensation (Pappas, Fisher, and Martin, 1976) sought as a study
objective to link certain military jobs with their civilian equivalents and
did not appear to find any other system more valid than the job factor equation.
Some direct use of task inventories may offer supplemental data for assigning
grade, with a basis for determining the grades of personnel performing similar
sets of tasks. The consensus of expert judgment which gives validity to the
job factor methodology is based essentially on a "voting process," so that
any later adjustment to a job grade likely results from variables such as MOS
grade structure or some other grading guidelines. Comparative analyses of the
Armed Services and wage scale linkage in the civilian economy, or even grading
policies of major corporations and labor unions, can provide more insight
regarding the relationships of grade, wage scale, and overall compensation.

No matter which line of grade estimating you follow, the job factor approach
will provide an accurate base estimate from which grade determination can
proceed so that analysts are better oriented to apply required policy guidance.

The downgrading equation was not highly significant, yet it points the way
toward designing a unique policy strategy for reacting to grade constraints
or limitations. This procedure, along with a firm "rounding" guideline when
averaging position grade estimates, gives a more conservative and slightly
lower grade projection that is arbitrated by expert analyst judgment using
at least five independent job ratings and Army policy guidance.

The Job Rating Questionnaire is meant to offer certain supplemental
information related to grading and MOS structure design. This information
can help in modifying selected personnel actions so that the career patterns
and MOS management are more related to the policy understanding of officer
and enlisted personnel who must perform the operational functions of personnel
utilization. The strength and relatively obvious interpretation of equation
and questionnaire results did not recommend further research to differentiate
possible effects of rater characteristics. However, after continuing purification
of the data base, a multiple R of over .97 has resulted and some
explanation of rater effects will follow in a later report.

In summary, the job evaluation methodology was refined and is being
revised by extended Army Research Institute effort to be flexible in relation
to externally imposed grade constraints. Policy forecasting questions were
explored for further development so that grade requirements can adjust to
changing policy for time-in-grade, MOS strength, and grade-pay phasing.

Career progression design can have a more equitable basis and future Army
enlisted grading can apply highly reliable standards when deciding job value.

I. INTRODUCTION


The Officer Grade Requirements (OGR) research project was designed
to assess the appropriate grade (lieutenant through colonel) for
each of the 62,000 jobs in the total non-aircrew officer force.
The job evaluation technology which was developed in the course
of the OGR project is a systematic and reliable method which can
be used to determine grades for non-aircrew jobs based on job content
and responsibility. The present updated version of the technology
was developed for use by the Air Force Management Engineering
Teams (METs) to be employed as a manpower tool in assigning grades
to jobs.


For those not familiar with the Air Force occupational structure,
there are 54 career fields or job groupings for officers (excluding
pilots and physicians) which are made up from over 180 Air Force
Specialties. As an example, Table 1 reflects the scientific career
field made up of seven specialties.


Table 1. Example of Air Force Job Structure

One Career Field:

  26XX  Scientific

Comprised of Seven Air Force Specialties:

  2616  Scientific Manager
  2635  Physicist
  2645  Chemist/Biologist
  2655  Metallurgist
  2665  Nuclear Research Officer
  2676  Behavioral Scientist
  2685  Scientific Analyst


II.  REVIEW  OF  OFFICER  GRADE  REQUIREMENTS  RESEARCH 


The basic technology for the OGR job evaluation method has
been under development and refinement for over 15 years. To understand
the methods and procedures used in the present study, a brief explanation
of the initial phase of the OGR (1963-1966) is presented below,
prior to a discussion of the latest job evaluation exercise employed
by METs.

Six steps were involved in the first phase of the OGR project.
Figure 1 depicts actions, dates and job sample sizes for these six
steps.

Step  1.  Collection  of  Job  Descriptions  (1963) 

The first step of the project involved collection of accurate and
detailed job statements describing the work performed by Air Force
officers in all Air Force specialties. Figure 1 shows the initial 1963
job collection of 79,750 officer jobs. Each job incumbent provided a
job title, a description of his job in the Air Force organizational
structure, and a detailed description of duties and tasks performed.
The incumbent's supervisor was asked to review the job description
for accuracy and provide a judgment concerning the appropriate grade
for that job.

Step 2. Selection of Criterion Sample (1963-1964)

The second step consisted of selection of 3,575 representative
jobs in all grades across all major air commands, both overseas and
in the continental United States. Jobs were selected from all Air
Force Specialty Codes (AFSCs).

Step  3.  USAF  Policy  Board  Ratings  (1964) 

In the third step, a policy board composed of 22 colonels was
convened by HQ USAF. Policy board members were selected on the
basis of their experience in various career areas so that for any
of the 3,575 criterion sample jobs, there was at least one member
who could serve as an expert consultant to the rest of the board in
making judgments as to the appropriate grade (lieutenant through
colonel) for the jobs. Board members were instructed to judge
each job on its merits.

In quantifying and recording the board's judgments the following
measures were taken:

a) Board members rated the appropriate grade level for a job and
then indicated on a 3-point scale their level of confidence in such
ratings. They were given access to any information needed to make
accurate judgments. This included consultation with other members,
obtaining organizational, command, or installation information about
a job, and telephoning special air staff consultants or the supervisor
of the incumbent of the job being rated. However, members were
advised that their ratings were to be independent and were to reflect
the unbiased judgment of the rater alone. The board members did not
have knowledge of the authorized (Unit Manning Document-UMD) grade
for the job being rated nor of the grade stated by the incumbent's
supervisor. They were not informed of the grade held by the incumbent
in the job, nor were grade ratings assigned by other board members
available to the rater.

b) Grade ratings for particular jobs were obtained independently
from five separate board members since earlier Air Force job evaluation
research indicated that an average of five independent ratings provided
stable estimates for job evaluation purposes.

c) Each job was rated in a context of other jobs since earlier job
evaluation research on context effects suggested that more accurate
ratings of job level are obtained when a job is considered with other
jobs of varying content and level.

d) Board members rated grade requirements using a 16-point rating
scale consisting of three levels of experience within each grade from
lieutenant through colonel, and one level for general. This scale
was based on findings that ratings are more stable when judges make
the finest discriminations of which they are capable, and the assumption
that experienced officers can distinguish jobs requiring high, moderate,
or low levels of experience.
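
The 16-point scale described in item d) can be illustrated with a short sketch. The numbering from 1 upward and the ordering of the experience levels (low, moderate, high) within a grade are assumptions made for illustration; the text states only that there were three levels within each of the five grades and one level for general.

```python
from typing import List, Optional, Tuple

# Assumed layout: points 1-15 cover five grades x three experience levels,
# and point 16 is the single level for general.
GRADES = ["lieutenant", "captain", "major", "lieutenant colonel", "colonel"]
LEVELS = ["low", "moderate", "high"]  # assumed ordering within a grade


def scale_point_to_grade(point: int) -> Tuple[str, Optional[str]]:
    """Map a scale point (1-16) to a (grade, experience level) pair."""
    if not 1 <= point <= 16:
        raise ValueError("scale point must be between 1 and 16")
    if point == 16:
        return ("general", None)
    grade_index, level_index = divmod(point - 1, 3)
    return (GRADES[grade_index], LEVELS[level_index])


def mean_rating(ratings: List[int]) -> float:
    """Average the independent ratings given to one job (five in the study)."""
    return sum(ratings) / len(ratings)
```

Under these assumptions, a job whose five raters gave points 7, 8, 8, 9, and 8 would have a mean rating of 8.0, which falls at the moderate level of major.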

Step  4.  Analysis  of  Policy  Board  Actions  (1966) 

Analysis of the policy board rating data was a critical part of
the project since these ratings formed the basis for establishing
grade requirements. A series of analyses was accomplished to determine
if the grade ratings were stable; if there was high agreement among
board members concerning the appropriate grade requirements for particular
jobs; if the raters had confidence in their ratings; and if the
raters were biased for or against jobs in various AFSCs or commands.

Main results from these analyses were as follows.

a) The reliability coefficient (.92) of the mean grade ratings as
given by the policy board indicated there was high agreement among
board members concerning grade requirements for jobs in the criterion
sample.

b) Based on a 3-point scale of confidence (1 = low to 3 = high),
board members expressed a high level of confidence in ratings, with
at least 4 of the 5 raters expressing the highest level of confidence
in their ratings of 2,389 of the 3,575 jobs. In fact, only 59 of the
jobs had a confidence rating of less than 2.

c) Analysis designed to identify raters showing a bias for or
against jobs by command or occupational grouping revealed that individual
board members did not exhibit a bias towards jobs in any particular
command or AFSC. The highest reported disagreement among board members
varied from the mean by only 1.7 points on a scale from 1 to 16, with
most values being less than 1. The maximum level of disagreement
among raters varied by less than 1/2 of a grade.

d) Additional analyses indicated that board members agreed that
many jobs were inappropriately graded and that each job was considered
on its own merits. Comparison of UMD versus board grade revealed
no systematic tendency on the part of raters to confirm current UMD
grade authorizations or to inflate their ratings of grade requirements.
Many jobs were downgraded as much as one or two full grade levels,
while others were upgraded. Reliability analyses indicated there was
strong agreement among board members as to which particular jobs should
be upgraded and downgraded.

Step  5.  Development  of  Policy  Equation  (1964) 

The fifth step of the initial OGR development entailed the development
of a multiple linear regression equation which "captured" the
policy-making grade decisions of the criterion board using several
variables with judgments obtained from 1,246 field grade raters on
the 3,575 job set. The high agreement among board members concerning
the appropriate grade for jobs indicated they considered similar factors
in making their decisions. In developing the equation, over 200 predictor
variables and 350 regression problems were considered. As a result
of these analyses, predictors were eventually narrowed down to 9 variables
involving 1) special training and work experience, 2) communication
skills, 3) judgment and decision making, 4) planning, 5) management,
6 & 7) two levels of organization information, 8) the supervisor's
judgment of grade for the job (lieutenant through general), and 9)
the field grade judge rating.
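
The idea of "capturing" a board's policy with a multiple linear regression can be sketched as follows. This is a minimal illustration and not the OGR equation: the two predictors, the five jobs, and the board grades are synthetic values invented for the example, and the fitting routine is an ordinary least squares solver written out by hand so that it needs no external library.

```python
def fit_policy_equation(rows, targets):
    """Ordinary least squares via the normal equations (X'X)b = X'y."""
    # Prepend a constant 1.0 so the first weight is the intercept.
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, row):
    """Apply the fitted equation to one job's predictor values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Synthetic data: two made-up factor ratings per job, and board grades that
# follow the made-up policy 1 + 0.5 * first factor + 1.0 * second factor.
factor_ratings = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 4)]
board_grades = [1 + 0.5 * a + 1.0 * b for a, b in factor_ratings]
weights = fit_policy_equation(factor_ratings, board_grades)
```

Because the synthetic board grades follow the made-up policy exactly, the fitted weights recover it (intercept 1, weights 0.5 and 1.0) up to floating-point error. Real board ratings would contain disagreement, and the fit would be judged by a validity coefficient rather than by exact recovery.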


Step 6. Application of Policy Equation (1965)

The last or sixth step in this initial phase of job evaluation
development consisted of applying the grade policy equation to an
additional 8,250 job descriptions rated by field grade raters (see
Figure 1). Together with the 3,575 policy board ratings from jobs,
appropriate grades for over 11,000 officer jobs had been determined.
From this large base it was possible to project to the grade structure
of the entire Air Force consisting of 105,908 jobs, and to compare
projected grade requirement statements with the total Air Force
authorizations (UMD).


Results indicated that changes were necessary in many career
fields in order to align grades with jobs based on job content
and responsibility levels as stated by OGR. For some career fields
such as Research and Development, Developmental Engineering, and
Financial, the OGR study indicated that too few colonel positions
were available, while in other fields such as Education and Training,
Information, and Legal, too many colonel positions were allocated.
Overall, many jobs were found to be undergraded, especially at the
rank of major, where greater content and responsibility levels existed
than what UMD authorizations indicated.


Development of Benchmark Rating Scales

In addition to the above technology, in 1966 benchmark scales
were developed for the OGR job evaluation factors of special training
and work experience, communication skills, judgment and decision making,
planning, and management. Scales range from 1 to 9 with three job
titles chosen to represent each of the nine increasing skill levels.
These scales allow more consistent and reliable judgments to be made
by comparing isolated or single jobs with the benchmarked jobs of
the scale. The scales were validated using 1,000 officer positions.
The job grades obtained with benchmark scales were in high agreement
with those of the Policy Board (r = .90). An example of a current benchmark
scale for the management factor is presented in Table 2.

Table 2. An example of a benchmark scale.


MANAGEMENT: The level of executive and managerial skills required in the job
[illegible] and level of the activities which are directed, organized,
[illegible], controlled, commanded, or evaluated.

LEVEL 9

Director of Budget, Hq Major Air Command
Commander, Combat Support Gp (Overseas)
Wing Commander, Tactical [illegible] Wg (Overseas)

LEVEL 7

Administrative Officer, Air Base Sq
Data [illegible] Officer, Combat Support Gp
Tactical Fighter Pilot, Tactical Fighter Sq

LEVEL 1

[illegible] Urologist, USAF Hospital
Psychiatric Social Worker, USAF Hospital
Helicopter Pilot, Single Rotor, Air [illegible] Sq


III. DEVELOPMENT OF JOB EVALUATION SYSTEM FOR METs


The first development of the technology using MET personnel
began in 1974. Previously the technology had employed field grade
officers as raters. The 1974 pilot test confirmed that METs could
also apply the technology with accuracy and consistency.


The process used in the determination of grades is presented
in Figure 2. The basic process involves seven actions from the time
the job description is filled out until the appropriate grade for
the job is assigned. After the supervisor makes a grade judgment,
MET raters make judgments on five grade factors with the aid of benchmark
scales which were discussed previously. When this information is
combined with organization information, resulting scores can be converted
to a grade which reflects the job content and responsibility level
of the job.


Figure 2. The grade process.


Job descriptions were collected by MET members and rated on
job factors. In the 1974 pilot test, 1,687 job descriptions were
collected to test the feasibility of MET application of the OGR
system. In addition, 485 of the original 3,575 criterion board
jobs were rated by MET members to assess the validity of the OGR
technique. Reading the crossbreak horizontally in Table 3 shows
2,172 jobs rated in the 1974 pilot study, consisting of 485 policy
board jobs and 1,687 current Air Force officer jobs which were collected
by METs. Successful completion of the pilot test indicated that
METs could systematically and reliably determine grade requirements
for jobs, and a full field test was therefore undertaken to refine
and expand this job evaluation technology. Samples from the 1974
pilot study were incorporated into the crossbreak design of the larger
effort as shown in Table 3. In the 1976 study, as in the 1974 pilot
study, two kinds of jobs were rated. One set consisted of jobs from
the original 1964 policy board set and the other consisted of current
Air Force job descriptions collected during the study.

Reading the crossbreak vertically, rather than horizontally, it
may be seen that, of the policy board jobs, a total of 1,725 criterion
jobs were rated by MET members. In addition, over 11,000 new jobs
were collected and rated in this phase of the OGR. This 11,000 case
sample was used as the base from which projections were made to the
total non-aircrew force of over 62,000 Air Force officer positions.

In all, METs evaluated over 11,000 jobs in this most recent undertaking
of the OGR research project.


Table 3. Sample Crossbreak of Subsets and Total
Number of Jobs Used in Job Evaluation


                    Policy Board Jobs    Current A.F. Jobs    Totals

1974 pilot study    N = 485 (1)          N = 1,687 (4)        N = 2,172 (7)
1976 study          N = 1,240 (2)        N = 9,634 (5)        N = 10,874 (8)
Totals              N = 1,725 (3)        N = 11,321 (6)       N = 13,046 (9)


(1) to (3)   Policy Board jobs used as the criterion in
             computing validity and construction of the
             grade conversion table

(4) to (6)   Current jobs newly collected, used as the base
             for making projections to the total non-aircrew
             force

(7) and (8)  Rating subset information used for reliability
             computations

(9)          Grand total number of all jobs used in the study
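
The crossbreak can be checked with simple arithmetic. The sketch below takes the counts that are stated in the text and in the table, derives the two second-study cells by subtraction (those two values are computed here, not quoted from the source), and confirms that the row and column totals agree. The variable names are illustrative only.

```python
# Counts reported in the text and in Table 3.
pilot_policy_board = 485      # cell (1)
pilot_current = 1_687         # cell (4)
total_policy_board = 1_725    # cell (3)
total_current = 11_321        # cell (6)
second_study_total = 10_874   # cell (8)
grand_total = 13_046          # cell (9)

# Cells derived by subtraction from the column totals.
second_policy_board = total_policy_board - pilot_policy_board   # cell (2)
second_current = total_current - pilot_current                  # cell (5)

# Row total for the pilot study, cell (7).
pilot_total = pilot_policy_board + pilot_current

# The derived cells must reproduce the reported row and grand totals.
margins_agree = (
    pilot_total == 2_172
    and second_policy_board + second_current == second_study_total
    and total_policy_board + total_current == grand_total
)
```

The derived cells come out to 1,240 policy board jobs and 9,634 current jobs for the second study, and all reported totals are mutually consistent.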

The 1,725 criterion sample set of jobs consisted of those
non-aircrew jobs which are still in the Air Force today. This sample
was used to derive a new set of regression equation weights for
the Policy Equation and to construct a conversion table for converting
composite scores to military grades.


The second, current job sample was used to project grade requirements
to the total non-aircrew force. Sampling specifications
were established for the 122 METs, with jobs stratified across Air
Force specialties and across grades (lieutenant through colonel) in
order to assure that descriptions were representative of the total
non-aircrew force. Sample specifications were based on a December
Unit Detail Listing (UDL - present version of the UMD). Over 900
MET raters from all over the world participated in the study.

Analyses  of  1,725  Criterion  Jobs 

Analyses resulted in an 8 variable multiple regression equation
using the five factors, two level of organization variables, and the
supervisor's judgment. When compared to the grades assigned by the
Policy Board to the same jobs, the Policy Equation reflects a validity
coefficient of .90. Individual validities for each of the variables
are reported in Table 4. Based on these findings, the 1,725 job set
was used to construct a stable conversion table to convert the composite
scores obtained from the application of the regression equation to a
military grade. This permits the evaluation of a job, using the grade
equation to provide a score, and a conversion of the resulting score
into the appropriate officer grade required for the job.
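
A conversion table of this kind amounts to a set of cut scores on the composite scale. The sketch below shows one way such a lookup could work; the cut scores are hypothetical, since the actual conversion table is not reproduced in this paper, and the rule that a score equal to a cut point earns the higher grade is likewise an assumption.

```python
from bisect import bisect_right

# Hypothetical cut scores: the lowest composite score that earns each
# grade above lieutenant. These are NOT the OGR conversion table values.
CUT_SCORES = [20.0, 30.0, 40.0, 50.0]
GRADES = ["lieutenant", "captain", "major", "lieutenant colonel", "colonel"]


def composite_to_grade(score: float) -> str:
    """Convert a composite score to a grade using the cut scores."""
    # bisect_right counts how many cut scores are <= score, which is
    # exactly the index of the grade the score falls into.
    return GRADES[bisect_right(CUT_SCORES, score)]
```

With these made-up cut points, a composite of 10 converts to lieutenant, 20 to captain, and 55 to colonel.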


Table 4. Validity of Grade Predictors in the Grade Equation for the OGR


Variable                                      Validity


Special training and work experience            .65
Communication skills                            .72
Judgment and decision making                    .74
Planning                                        .78
Management                                      .79
Level of organization                           .50
Level of job within organization                .47
Supervisor's judgment of appropriate grade      .78
Final grade composite score                     .90


Based on an average of 13.61 ratings per job

By using the grade equation, this method will assign the same
grade to each non-aircrew position as would have been assigned by
the Policy Board members. Table 5 indicates the variables of the
equation and their associated weights used in determining a composite
score. The mean values of 5 job evaluation factors rated on benchmark
scales are weighted and combined with the two organization variables
and the supervisor's judgment.


Table 5. Application of the Policy Equation


GRADE DETERMINANTS                            JOB VALUE   WEIGHT        WEIGHTED COMPONENT

Special training and work experience          X           1
Communication                                 X           1
Judgment and decision making                  X           [illegible]
Planning                                      X           1
Management                                    X           3
Level of organization                         JD*         1
Level of job within organization              JD*         1
Supervisor's judgment of appropriate grade    JD*         **

                                                          COMPOSITE SCORE

*  from job description

** + or - depending on grade colonel through lieutenant
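
The way Table 5 combines its rows into a composite score can be sketched as a weighted sum. All weights and the supervisor adjustment values below are hypothetical and chosen only to make the arithmetic visible; they are not the OGR equation weights. Treating the supervisor's judgment as a signed adjustment by grade is one assumed reading of the "+ or -" note.

```python
# Hypothetical weights for the five rated factors and the two
# organization variables. Not the actual OGR equation weights.
WEIGHTS = {
    "special_training": 1.0,
    "communication": 1.0,
    "judgment": 2.0,
    "planning": 1.0,
    "management": 3.0,
    "organization_level": 1.0,
    "job_level": 1.0,
}

# Hypothetical signed adjustment for the supervisor's grade judgment.
SUPERVISOR_ADJUSTMENT = {
    "lieutenant": -2.0,
    "captain": -1.0,
    "major": 0.0,
    "lieutenant colonel": 1.0,
    "colonel": 2.0,
}


def composite_score(job_values, supervisor_grade):
    """Sum the weighted components and add the supervisor adjustment."""
    weighted = sum(WEIGHTS[name] * job_values[name] for name in WEIGHTS)
    return weighted + SUPERVISOR_ADJUSTMENT[supervisor_grade]


# A made-up job: mean benchmark ratings plus two job-description values.
example_job = {
    "special_training": 5,
    "communication": 4,
    "judgment": 6,
    "planning": 5,
    "management": 7,
    "organization_level": 3,
    "job_level": 2,
}
example_score = composite_score(example_job, "major")
```

For this made-up job the weighted components sum to 52, and a supervisor judgment of major leaves the composite unchanged at 52.0.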


Analyses  of  11,321  Current  Jobs 

To examine how well the MET raters agreed among themselves
as to a job's content and responsibility, inter-rater reliability
coefficients were computed. Based on 6.94 ratings per job, the
intra-class coefficient was r = .97. This coefficient
indicated very high agreement among the raters and indicated that
if the 11,000 jobs were evaluated by another group of raters, similar
results would be obtained.
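
An intra-class coefficient for the mean of several ratings can be computed from a one-way analysis of variance. The sketch below uses the form (MS between - MS within) / MS between, which is one common definition; the paper does not state which form was used, so this choice is an assumption. The sketch also assumes every job has the same number of ratings, whereas the study averaged 6.94 ratings per job, and the ratings shown are invented.

```python
def reliability_of_mean_ratings(ratings_by_job):
    """One-way ANOVA intraclass coefficient for the mean of k ratings."""
    n_jobs = len(ratings_by_job)
    k = len(ratings_by_job[0])
    if any(len(r) != k for r in ratings_by_job):
        raise ValueError("this sketch assumes the same number of ratings per job")
    grand_mean = sum(sum(r) for r in ratings_by_job) / (n_jobs * k)
    job_means = [sum(r) / k for r in ratings_by_job]
    # Variation between job means, scaled by the number of ratings per job.
    ms_between = k * sum((m - grand_mean) ** 2 for m in job_means) / (n_jobs - 1)
    # Variation among raters within the same job.
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(ratings_by_job, job_means) for x in r
    ) / (n_jobs * (k - 1))
    return (ms_between - ms_within) / ms_between


# Invented ratings: three jobs, three raters each, on the 16-point scale.
example_ratings = [[2, 3, 2], [8, 9, 8], [14, 13, 14]]
example_reliability = reliability_of_mean_ratings(example_ratings)
```

When raters differ little within a job relative to the spread between jobs, as in the invented data, the coefficient approaches 1.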




IV. PROJECTIONS TO THE TOTAL NON-AIRCREW OFFICER FORCE


Based on the preceding findings and indications, it was possible
to project, with considerable accuracy, the results which would be
obtained if the METs were to apply OGR technology to the entire
non-aircrew force. The 11,321 sample was representative and large
enough (18 percent of the force) to provide stable projections of
the various career utilization fields for the 62,602 positions listed
on the December 1975 Manpower and Organization UDL (M&O UDL).
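
The sampling fraction quoted above follows directly from the two counts, and the projection step can be pictured as scaling sample proportions up to the population. The sketch below is a simplification: it ignores the stratification by specialty and grade that the study used, and the sample counts passed to the function in the usage note are invented.

```python
sample_size = 11_321
force_size = 62_602

# Share of the non-aircrew force covered by the sample (about 0.18).
sampling_fraction = sample_size / force_size


def project_counts(sample_counts, population_size):
    """Scale sample grade counts to the population, keeping proportions."""
    total = sum(sample_counts.values())
    return {
        grade: round(population_size * n / total)
        for grade, n in sample_counts.items()
    }
```

For instance, an invented sample of 25 majors and 75 captains projected to a population of 1,000 gives 250 majors and 750 captains.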

Table 6 presents the overall results from this projection exercise,
comparing the authorized manpower statements of the UDL, OGR statements
of grade, and the December 1975 actual on-board manning strength
as reflected in the Uniform Officer Records (UOR). As may be seen,
OGR calls for a decrease in the number of currently authorized colonels,
an increase in lieutenant colonels, a very large increase in majors,
and a significant decrease in the authorizations for captains and
lieutenants. In summary, an overall increase in the number of field
grade officers would be needed to match the grades of the incumbents
with the content and responsibility levels of Air Force non-aircrew
officer jobs.


Table 6. Projections to the Total Non-Aircrew Force


                                                            PRESENT GRADE
             M&O UDL            OGR                         DEC 1975 - UOR

Grade        N         %        N         %    Difference   N         %

COL           4,739    7.57      4,276    6.83      -463     4,306    6.53
LTC          10,358   16.55     11,000   17.57      +642    10,153   15.38
MAJ          13,744   21.95     19,204   30.68    +5,460    14,307   21.67
CAPT/LT      33,761   53.93     28,122   44.92    -5,639    37,248   56.42

Total        62,602             62,602                      66,014
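
The difference column and the OGR percentages in Table 6 can be reproduced from the counts. The sketch below uses 11,000 for the OGR lieutenant colonel count and 28,122 for the OGR captain/lieutenant count, which are the readings consistent with the printed differences and with the column total of 62,602. The dictionary names are illustrative.

```python
# Authorized (UDL) and OGR-projected counts by grade, from Table 6.
udl = {"COL": 4_739, "LTC": 10_358, "MAJ": 13_744, "CAPT/LT": 33_761}
ogr = {"COL": 4_276, "LTC": 11_000, "MAJ": 19_204, "CAPT/LT": 28_122}

# Change in authorizations that OGR would call for, by grade.
differences = {grade: ogr[grade] - udl[grade] for grade in udl}

# OGR share of the force by grade, in percent.
ogr_total = sum(ogr.values())
ogr_percent = {grade: round(100 * n / ogr_total, 2) for grade, n in ogr.items()}

# Net change in field grade (major through colonel) authorizations.
field_grade_change = sum(differences[g] for g in ("COL", "LTC", "MAJ"))
```

The net field grade increase of 5,639 exactly offsets the decrease in captain and lieutenant authorizations, since both columns total 62,602.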

In addition to the needed increase in majors, other conditions were
found to exist based on projection data. Examining the projections
for the 54 career fields, it was noted that some career fields were
overgraded at various grade levels and that some were undergraded.
For example, turning again to the scientific career field, it may
be noted in Table 7 that many more major and lieutenant colonel
authorizations are needed for scientific jobs based on the content
and responsibility levels of the work performed.


Table 7. AFSC: 26XX  TITLE: SCIENTIFIC


                                                            PRESENT GRADE
             M&O UDL            OGR                         DEC 1975 - UOR

Grade        N         %        N         %    Difference   N         %

COL             12     1.07         6     0.53        -6         6    0.51
LT COL         102     9.09       141    12.57       +39        98    8.28
MAJ            273    24.33       504    44.92      +231       257   21.71
CAPT/LT        735    65.51       471    41.98      -264       823   69.51

Total        1,122              1,122                        1,184

V. CONCLUSIONS

The officer grade requirements (OGR) project has been one of
the most extensive job evaluation research projects in existence.
In the course of over 15 years of research and development the technology
was used to determine the appropriate grades for 11,825 jobs based
on individual ratings in 1963-1966. In the recent development of
the technology employing Management Engineering raters, 13,046 individual
jobs were rated. Taken together this constitutes over 24,000 jobs
which have been evaluated using the OGR technology. When the 1966
projections to 105,908 jobs are coupled with the 1974-1976 projections
to 62,602 jobs, this results in over 168,000 jobs having been considered
by the OGR job evaluation system.
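
The totals quoted in this paragraph can be traced back to counts given earlier in the paper. A short check, with illustrative variable names:

```python
# Jobs graded from individual ratings in the first phase (1963-1966):
# policy board criterion jobs plus jobs rated by field grade raters.
first_phase_rated = 3_575 + 8_250

# Jobs rated by Management Engineering raters in the recent phase.
met_rated = 13_046

# All jobs individually evaluated with the technology.
total_rated = first_phase_rated + met_rated

# Jobs covered when the two sets of projections are combined.
total_considered = 105_908 + 62_602
```

These sums give 11,825 jobs for the first phase, 24,871 jobs rated in all, and 168,510 jobs considered, in line with the rounded figures in the text.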

The basic components of the OGR system consist of five job
factors (special training and work experience, communication skills,
judgment and decision making, planning, and management) judged
with benchmark scales, two organization variables, and the supervisor's
judgment of the appropriate grade. When entered into a multiple
linear regression equation, these variables produce a composite
score which can be converted into an appropriate military grade
from lieutenant through colonel.

When  applied  by  the  METs,  this  technology  will  provide  a  systematic 
and  reliable  device  for  determining  officer  grade  requirements  based 
upon  job  content  and  responsibility  associated  with  specific  jobs. 

Note: This paper has drawn freely from various Air Force Technical
Reports regarding the OGR research. A list of key reports is provided
in the references.


JOB EVALUATION OF A LINKAGE
TECHNOLOGY 1


Objective 

This study was a research project conducted for the 1975
Quadrennial Review of Military Compensation (QRMC). The project
was designed to test the validity of previously developed job
linkages used to compare earnings for "equivalent" military
and Civil Service job levels and occupations. The study also
provided a demonstration of a method for evaluating job content
(difficulty) which could be used both to determine linkages
and relative positioning of military and Civil Service occupations.
Further, the methodology provides a ready mechanism for a direct
comparison of the difficulty of military occupations to the
difficulty of jobs in the private sector.

The initial objective of the project was to test the linkages
developed in the first QRMC conducted in 1967, as well as other
proposed linkages. This analysis included a sample of military
(combat arms) jobs for which no counterpart exists in the civilian
economy and which were not evaluated by the 1967 QRMC. (The
combat arms jobs were not evaluated by the 1967 QRMC due to the
limitations of the job evaluation techniques employed.) Analyses
were also made of pay grades in the Civil Service to evaluate
their job content and relative position with respect to selected
military pay grades.

It  should  be  noted  that  pay  comparability  between  the  military 
and  Civil  Service  and  the  internal  equity  of  military  compensation 
were  not  examined  in  this  study.  Instead,  the  focus  of  this 
study  was  on  the  measurement  of  job  content  in  military  and 
Civil  Service  occupations.  Comparisons  were  made  between  the 
military  service  and  the  Federal  Civil  Service  with  respect  to 
the  job  content  within  selected  pay  grades  of  these  two  job 
classification  systems. 


1 This presentation is derived from the Executive Summary
of a study conducted for the 1975 Quadrennial Review of
Military Compensation by Hay Associates under contract
number MDA 903-76-C-0018, An Analysis of Selected Linkages
Between Military and Civil Service Occupations, Linda D.
Fappas [co-author names illegible], April 1976.


The task of establishing the relative positioning of pay
grades in the military service and the Federal Civil Service
entailed the use of the following strategy for research: (a)
sampling procedures were used to ensure that positions with
large numbers of incumbents were represented at each pay grade
so that the sample jobs represented the norm of job content at
each pay grade; (b) jobs at each level were evaluated using a
common (single) standard of measurement, the Hay Method, to permit
comparative analyses to be made; and (c) the results of the job
evaluation were subjected to both visual inspection and statistical
analysis to evaluate linkages and determine the relative
positioning of pay grades in the military and Federal Civil
Service job classification systems.

Details  of  the  approach  are  summarized  below. 

Approach 

The approach involved the selection of a representative
sample of military and Federal Civil Service positions. A
two-stage sampling procedure was employed.

First,  certain  military  pay  grades  and  Civil  Service  pay 
grades  were  selected.  A  total  of  seven  (7)  military  pay  grades 
and  thirteen  (13)  Civil  Service  pay  grades  were  represented. 

The seven military grades represent a broad spectrum of the
military pay structure. The military grades were E-3, E-5,
E-7, O-1, O-2, O-5, and O-8. These grades include approximately
41% of the total Armed Forces population. The thirteen Civil
Service pay grades were GS-3, GS-5, GS-7, GS-9, GS-14, GS-15,
GS-18, WG-5, WG-6, WG-8, WG-10, WS-9 and WS-10. These pay
grades represent approximately 46% of all General Schedule
(white-collar) employees and 54% of all Wage Grade (blue-collar)
employees.

Second,  occupations  were  sampled  at  each  of  these  military 
and  Civil  Service  pay  grades.  For  both  the  military  and  the 
Civil  Service,  jobs  were  purposely  selected  to  represent 
occupations  with  large  numbers  of  incumbents.  Hence,  jobs  were 
selected  to  best  represent  the  typical  jobs  at  each  pay  grade. 

The military job sample was also selected to represent each of
the Armed Services. Further, the military sample included jobs
in each DOD occupational category including DOD category "0"
(Combat Arms) occupations. Jobs in this category were not
evaluated by the 1967 QRMC, because these purely military
occupations cannot be evaluated by Civil Service standards.
However, the military job sample in the present study excluded
jobs for which the incumbents were in training.

The  military  and  Civil  Service  job  samples  were  reviewed 
for  representativeness  by  the  QRMC  and  the  U.S.  Civil  Service 
Commission, respectively. It was determined that the samples
constituted  a  fair  representation  of  jobs  at  each  pay  grade. 

A total of 140 military jobs and 193 Civil Service jobs
were included in the sample, for a grand total of 333 jobs.1
Although the number of jobs sampled may appear to be small,
the number is quite large compared to other surveys, e.g., the
PATC Survey.2 Further, these jobs were purposely selected to
represent occupational specialties with large numbers of
incumbents, and hence to provide the basis for estimating the
typical job content at each pay grade. In total, the sample
positions include 48% of the military population in the seven
selected military pay grades, and 42% of the Civil Service
population in the thirteen selected Civil Service pay grades.
This sample size is adequate for estimating the typical difficulty
of jobs at these pay grades in the military and Federal Civil
Service.

Written job description materials provided by the Armed
Forces and the U.S. Civil Service Commission were used to
describe the content of each job.3 Similar materials were used
by the 1967 QRMC in their evaluations. Existing written job
description materials were adequate for purposes of this study.
The quality of the materials did vary in terms of relevance and
completeness. For example, it was found that:


1 The term "jobs" is used throughout this report to describe
occupations and skill-level combinations.

2 In the 1974 PATC Survey, 84 occupation/skill-level combinations
comprised the sample used to estimate Federal Civil Service
pay requirements for jobs classified under the General
Schedule System.

3 The evaluation of O-2 positions was an exception, since job
descriptions unique to the O-2 pay grade are generally
unavailable.
• The job description information was generally not
integrated. For example, military Tables of Organization
(T/O's) were separate from the job descriptions.

• The accountability dimension, which has key importance
in the Hay evaluation system, was not well specified in
many of the job description materials. For example,
the military job descriptions did not report the numbers
of men supervised or the budget controlled by an officer.

• For some positions, the job description materials appeared
to exaggerate the content of the job. For example, some
of the Federal Civil Service classification standards
listed all duties an incumbent might perform, not the
actual duties performed.

• In contrast, the job descriptions for some positions were
so brief that it was difficult to understand the full
range of job content. For example, the military job
descriptions for some officer occupations were only one
paragraph long.

However,  these  problems  were  overcome  in  the  course  of  the  project. 
For  example,  the  job  description  materials  were  supplemented  as 
needed  with  additional  clarifying  information  as  to  the  purpose, 
nature,  scope  and  dimensions  of  the  positions.  Military  Tables 
of  Organization  and  information  on  unit  size  were  used  to 
complement  the  military  job  descriptions.  Position  classification 
experts in the U.S. Civil Service Commission clarified the problems
noted  for  some  Civil  Service  occupations. 

It is important to note that the evaluations of the military
occupations in this study may be conservative, in that each
military job was evaluated assuming a peacetime environment.
This assumption was also made by the 1967 QRMC. As such,
combat-related aspects of military jobs were not considered. No weight
or score adjustment was made to the military job evaluation
scores to reflect these and other conditions unique to military
service which increase the difficulty of military occupations.

An  Evaluation  Committee  of  experienced  private  sector 
consultants  was  formed  to  evaluate  the  333  positions.  These 
consultants  had  prior  work  experience  in  either  the  military 
service  or  the  Federal  Civil  Service.  They  also  had  extensive 
job  evaluation  experience  in  the  private  sector. 

In this application of the Hay Method, job evaluators rated
each military and Civil Service job based on an understanding
of the nature of the job. These ratings were made using Guide
Charts which represent the three factors of Know-How, Problem
Solving and Accountability inherent in each job. These three
factors are, in turn, defined by eight dimensions. Know-How is
defined by: (1) the extent of knowledge required by the job;
(2) breadth of managerial skills; and (3) human relations
requirements. Problem Solving is defined by: (1) the degree
of original thought required on the job; and (2) the degree of
limitations imposed on thinking. Accountability is the impact
of the job on end results and is defined by: (1) the extent of
freedom to act in the job; (2) the degree of primary (vs. shared)
accountability in the job; and (3) the magnitude (size) of the
job expressed in terms of resources.

In addition to the above factors, consideration was given
to Working Conditions when rating blue-collar non-supervisory
positions in the military and Civil Service. Working Conditions
are defined by three dimensions: (1) the extent of physical effort
required by the job; (2) the extent of exposure to hazards; and
(3) the quality of the environment, e.g., extent of exposure
to noise, fumes, heat, dirt, etc.
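
As a rough illustration of how the factor ratings described above could be carried forward into a single score per job, the sketch below records points for the three Hay factors and adds Working Conditions only for blue-collar non-supervisory positions. The class name, field names, and point values are hypothetical. The Guide Chart point values are not given in this paper, and simple addition of factor points is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobEvaluation:
    """Hypothetical record of one job's factor points (illustrative only)."""
    title: str
    know_how: int
    problem_solving: int
    accountability: int
    working_conditions: int = 0  # rated only for blue-collar non-supervisory jobs
    blue_collar_nonsupervisory: bool = False

    def total_points(self) -> int:
        """Sum the factor points; Working Conditions count only where they apply."""
        total = self.know_how + self.problem_solving + self.accountability
        if self.blue_collar_nonsupervisory:
            total += self.working_conditions
        return total


# Made-up point values, chosen only to exercise the arithmetic.
clerk = JobEvaluation("clerk (hypothetical)", 100, 25, 33)
mechanic = JobEvaluation("mechanic (hypothetical)", 115, 29, 38,
                         working_conditions=20,
                         blue_collar_nonsupervisory=True)

print(clerk.total_points())     # 158
print(mechanic.total_points())  # 202
```

Keeping the Working Conditions points separate from the three core factors makes it easy to compare white-collar and blue-collar jobs on the same three-factor basis when needed.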

The following sequential multi-stage job evaluation process
was employed:

• Independent ("raw score") evaluations of each job were made.

• Clarification score evaluations were determined in
committee.

• Consensus score evaluations were determined in committee.

• "Sore-Thumbing," consisting of an internal consistency
review of consensus scores, was done.

• Sponsor review of the evaluations was made. The QRMC
staff reviewed evaluations of military positions.
Representatives of the U.S. Civil Service Commission
reviewed evaluations of the Civil Service jobs; a
representative of the Standards Division reviewed evaluations
of the General Schedule positions; a representative of
the Trades and Labor Section reviewed evaluations of Wage
Grade and Wage Supervisory positions.




The evaluation data were compiled, resulting in one final
total point score for each of the 333 positions. Summary data
were computed, e.g., the median (typical) job difficulty at a
particular pay grade.1 The data were then inspected to determine
the relative positions of the military pay grades with respect
to the various Civil Service pay grades. A series of statistical
tests was also made of proposed linkages between pay grades in
the military and Federal Civil Service.

It should be noted that attempts to relate pay grades in
the military service to pay grades in the Civil Service are
complicated by the fact that different numbers of grades (levels)
exist in these two classification systems. A one-to-one mapping
of pay grades between the two systems is impossible. For example,
the eight officer grades (O-1 through O-8) do not line up with
the ten General Schedule grades (GS-7, 9, 11, 12, 13, 14, 15,
16, 17, and 18). Further, the nine enlisted grades do not line
up with the fifteen Wage Grades or the nineteen Wage Supervisory
grades, or the seven General Schedule pay grades (GS-3 through
GS-9). For this reason, attempts to relate the different systems
require that estimates be made of the relative position of pay
grades in one system compared to pay grades in the other system.
Thus, exact linkages should not be expected and relationships
like E-3 between WG-5 and WG-6 should be anticipated and readily
accepted.

Results 

A.  Enlisted  level 

The present study analyzed the relationships of job content
(difficulty) between military enlisted occupations at the E-3,
E-5, E-7 levels and Civil Service jobs under both the Civil
Service white-collar (General Schedule) system and the blue-collar
(Wage Grade/Wage Supervisor) system.


A.1. Relationships Between Pay Grades in the General Schedule
and the Enlisted Grades of the Military Service


1 The median is a measure of central tendency. Half the jobs
are more difficult than the median value; half the jobs
are less difficult.

The ladder in Figure 1 shows the relationship of jobs in
selected levels of the Civil Service General Schedule system
and jobs in the military pay grades of E-3, E-5, and E-7.

In terms of relative job content, sample jobs in the
enlisted grades were evaluated as possessing increased
difficulty at each higher level. Sample E-7 jobs were
evaluated as more difficult than sample E-5 jobs. Sample E-5
jobs were evaluated as more difficult than sample E-3 jobs. Job
content also varied by pay grade for the sample GS jobs. Thus,
sample GS-7 jobs were evaluated as more difficult than sample
GS-5 jobs, and GS-5 jobs were evaluated as more difficult than
GS-3 jobs.

Although the job content in the various pay grades of both
systems was evaluated as showing increased difficulty at higher
grades, some evidence of overlap in job difficulty between grades
was also found in each system. This implies that some jobs
have the same content (difficulty), although they are classified
in different pay grades. For example, some jobs at the GS-7 and
GS-5 levels overlap in the Civil Service system, while some jobs
at the E-7 and E-5 levels overlap in the military service.

The  following  relative  positioning  of  the  military  enlisted 
pay  grades  and  the  GS  pay  grades  was  found  by  inspection: 

• The content of the sample E-3 jobs is similar to the
content of the sample GS-3 jobs. The content of the E-5
jobs is similar to the content of the GS-5 jobs. The
content of the E-7 jobs is similar to the content of the
GS-7 jobs.

• In terms of median job content, the median E-3 job
lies near the median GS-3 job; the median E-5 job lies
near the median GS-5 job; and the median E-7 job falls
near the median GS-7 job. Note that the median job
content for each military grade exceeds the median for
the corresponding Civil Service grade. Thus, the E-3
median is 9% greater than the median for GS-3; the E-5
median is 7% greater than the median for GS-5; and the
E-7 median is 5% greater than the median for GS-7.

• In terms of range, the E-3 and GS-3 jobs have an almost
identical range of job content. The relatively narrow
range of GS-5 jobs falls within the wider range of E-5
jobs. The range of E-7 jobs falls within the relatively
wider range of GS-7 jobs.
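
The median comparisons reported above (for example, the E-3 median being 9% greater than the GS-3 median) amount to a simple percent difference between two medians. The following is a minimal sketch of that arithmetic. The function name and all score values are hypothetical and are not data from the study.

```python
from statistics import median


def percent_above(military_scores, civil_scores):
    """Percent by which the military median exceeds the Civil Service median."""
    m = median(military_scores)
    c = median(civil_scores)
    return 100.0 * (m - c) / c


# Hypothetical total point scores, not data from the study.
e3_scores = [95, 104, 109, 118, 130]
gs3_scores = [90, 96, 100, 107, 121]

print(round(percent_above(e3_scores, gs3_scores), 1))  # 9.0
```

A negative result would indicate that the Civil Service median is the higher of the two.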
Statistical analyses indicated that linkages exist at
E-3/GS-3, E-5/GS-5, and E-7/GS-7. The 1967 QRMC had identified
a linkage of E-3 to GS-3. The 1967 QRMC also found a linkage of
E-5 and GS-5, for the Navy and Marine Corps only. Linkages at
these levels were supported by the results of the present study.
A statistical linkage of E-7 to GS-7 was also found in the present
study.

A.2. Relationships of the Wage Grade System to E-3 and E-5

The  ladder  in  Figure  2  shows  the  relationship  of  levels  in 
the  Civil  Service  Wage  Grade  system  and  in  the  military  grades 
of  E-3  and  E-5. 

In terms of relative job content, the E-5 jobs were evaluated
as more difficult than the E-3 jobs, as noted previously.
Similarly, the sample WG-10 jobs were evaluated as more difficult
than WG-8 jobs, WG-8 jobs were evaluated as more difficult than
WG-6 jobs, and WG-6 jobs were evaluated as more difficult than
WG-5 jobs.

Although  the  job  content  in  the  various  Wage  Grades  was 
evaluated  as  showing  increased  difficulty  at  higher  grades,  some 
evidence  of  overlap  in  job  content  between  pay  grades  was  noted. 
Overlap  was  found  between  jobs  at  the  WG-5  and  WG-6  levels,  and 
between  jobs  at  the  WG-8  and  WG-10  levels. 

Overlap  was  not  found  between  the  E-3  and  E-5  levels. 

The  following  relative  positioning  of  military  and  Wage 
Grade  levels  was  found  by  inspection: 

• The content of the sample E-3 jobs is similar to the
content of the sample WG-5 and WG-6 jobs. In terms of
median job content, the median E-3 job lies between the
median WG-5 job and median WG-6 job. The E-3 median
is 24% greater than the median for WG-5. In terms of
range, the range of WG-6 jobs lies totally within the
range of E-3 jobs. In contrast, all of the WG-5 jobs
were evaluated below the median for the E-3 pay grade.

• The content of the sample E-5 jobs is generally similar
to the content of WG-8 and WG-10 jobs. The median E-5
job falls between the median WG-8 job and median WG-10
job. The E-5 median is 22% greater than the median for
WG-8. The range of E-5 jobs is similar to the range of
WG-10 jobs. In contrast, the range of WG-8 jobs falls
at the lower end of the E-5 distribution.
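
The range statements in this list (one range lying totally within another, two ranges overlapping, all jobs of one grade falling below another grade's median) can be checked mechanically once the evaluation scores are available. The sketch below shows one way to do so. The helper names and the scores are hypothetical and serve only to illustrate the comparisons.

```python
from statistics import median


def score_range(scores):
    """Return (lowest, highest) evaluation score for a pay grade."""
    return min(scores), max(scores)


def lies_within(inner, outer):
    """True if the range of `inner` falls totally within the range of `outer`."""
    lo_in, hi_in = score_range(inner)
    lo_out, hi_out = score_range(outer)
    return lo_out <= lo_in and hi_in <= hi_out


def ranges_overlap(a, b):
    """True if the two score ranges share at least one point."""
    lo_a, hi_a = score_range(a)
    lo_b, hi_b = score_range(b)
    return lo_a <= hi_b and lo_b <= hi_a


def share_below_median(scores, reference):
    """Fraction of `scores` evaluated below the median of `reference`."""
    ref_median = median(reference)
    return sum(s < ref_median for s in scores) / len(scores)


# Hypothetical total point scores, not data from the study.
e3 = [90, 105, 120, 135, 150]
wg5 = [80, 88, 95, 101, 110]
wg6 = [100, 112, 125, 140]

print(lies_within(wg6, e3))         # True
print(ranges_overlap(wg5, e3))      # True
print(share_below_median(wg5, e3))  # 1.0
```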
Statistical analyses indicated a linkage of E-3 to WG-6.
A statistical linkage of E-5 to WG-10 was also found. The
1967 QRMC identified a linkage of E-3 to WB-5 (now WG-5).
The 1967 QRMC also found a linkage of E-5 to WB-10 (now
WG-10) for the Navy and Marine Corps, but not the Army and
Air Force. The E-3/WG-5 linkage found by the 1967 QRMC was not
supported by the results of the present study. A linkage was
found at the E-5/WG-10 level, although interservice comparisons
were not made in this study. As noted above, a link at this
level was found only for the Navy and Marine Corps by the 1967
QRMC. The statistical analysis also showed that the E-5 and WG-8
jobs differ in content and that the E-3 and WG-5 jobs differ in
content.

A.3. Relationship Between Pay Grades in the Wage Supervisory
System and E-7

The  ladder  in  Figure  3  shows  the  relationship  between  two 
levels  in  the  Civil  Service  Wage  Supervisory  system  and  the 
military  grade  of  E-7. 

In terms of relative job content, sample E-7 jobs have
previously been shown to be more difficult than sample E-3
and E-5 jobs (see Figure 1 for details). Similarly, the sample
WS-10 jobs were evaluated as more difficult than WS-9 jobs.

Although  the  job  content  in  the  two  Wage  Supervisory  pay 
grades  was  evaluated  as  showing  increased  difficulty  in  WS-10 
compared  to  WS-9,  there  was  significant  overlap. 

The  following  relative  positioning  of  the  E-7  pay  grade 
to  the  two  Supervisory  pay  levels  was  found  by  inspection: 

• The content of the sample E-7 jobs is similar to the
content of the sample WS-9 and WS-10 jobs. In terms
of median job content, the median E-7 job lies between
the median WS-9 job and median WS-10 job. The E-7 median
was 5% greater than the WS-9 median. In terms of range,
the range of E-7 jobs lies totally within the range of
WS-10 jobs, suggesting an equivalence between these
levels. In contrast, 80% of the WS-9 jobs were
evaluated below the median for the E-7 pay grade.

Statistical analyses indicated that the sample E-7 jobs
were different in difficulty from both the WS-9 and the WS-10
jobs. No statistical linkage was found of E-7 to WS-9, or E-7
to WS-10. These results suggest that the median E-7 job clearly
lies within the range of WS-9 to WS-10.
B. Officer Level


The present study also evaluated the relationship between
pay grades in the General Schedule system and in the military
officer grades. Results are discussed separately for the O-1,
O-2, and O-5 grades, and for the O-8 pay grade, relative to
levels in the General Schedule (GS) pay structure.

B.1. Relationships Between Pay Grades in the General Schedule
System and the O-1, O-2 and O-5 Grades

The ladder in Figure 4 shows the relationship of selected
levels in the Civil Service General Schedule system and in the
military grades of O-1, O-2, and O-5.

In terms of relative job content, sample O-5 jobs were
evaluated as much more difficult than sample O-1 and O-2 jobs.
The sample O-2 jobs were evaluated as more difficult than the
O-1 jobs, although the evaluations were quite similar (see
statistical test results, below). In the pay grades evaluated
under the General Schedule system, the sample GS-15 jobs were
evaluated as more difficult than the GS-14 jobs, and the GS-9
jobs were evaluated as more difficult than the GS-7 jobs.

Evidence of overlap in job difficulty was noted in both the
military and Civil Service systems. Overlap was found between
O-1 and O-2 in the military service, and between GS-7 and GS-9
as well as GS-14 and GS-15 in the Civil Service.

Overlap was not observed between the O-2 and O-5 levels.
This is to be expected, given the differences in jobs at the
O-5 level compared to the O-2 level.

No jobs in the interim officer ranks were evaluated in this
study. It is possible that some overlap exists between
positions in adjacent pay grades involving O-2, O-3, O-4, and
O-5.


The following relative positioning of the military officer
grades and the GS pay grades was found by inspection:

• The content of the sample O-1 jobs is similar to the
content of the sample GS-7 and GS-9 jobs. The
content of sample O-2 jobs is similar to the content
of sample GS-9 jobs. The content of the sample O-5 jobs
is similar to the content of the sample GS-14 jobs and
some of the GS-15 jobs.
• In terms of median job content, the median O-1 job lies
between the median GS-7 job and median GS-9 job. The
O-1 median is 16% greater than the GS-7 median. The
median O-2 job falls above the median GS-9 job. The
O-2 median is 4% greater than the GS-9 median. The
median O-5 job falls between the median GS-14 job and
median GS-15 job. The O-5 median is 1[?]% greater than
the GS-14 median.

• With respect to range, O-1 jobs do not fall exclusively
into the range of either GS-7 or GS-9; instead O-1 jobs
fall in a range bounded by the two ranges defined by the
GS-7 and GS-9 job evaluations. In contrast, the ranges
for O-2 and GS-9 are very similar. GS-14 jobs fall
totally within the relatively wide range of O-5 jobs.

Statistical analyses indicated linkages of GS-9 to O-1, and
GS-9 to O-2. The 1967 QRMC found a linkage of O-1 to GS-7. This
linkage was not supported by the results of the present study.
The statistical analysis indicated that the sample GS-7 positions
had less content than the sample O-1 positions.

The finding that GS-9 could be linked to either O-1 or O-2
is explained by the additional finding that the job content of
O-1 and O-2 is quite similar. In two of the three tests performed
on the data, there was no evidence of a statistically significant
difference between O-1 and O-2 evaluations. (As noted earlier,
there was considerable overlap in the O-1 and O-2 job evaluations.
Their median job content was also similar; the intergrade
differential in median job content was only 16%.) This finding
indicates one problem of attempts to develop exact linkages
between classification systems with different grade structures.

Statistical analysis indicated the equivalence of O-5 and
GS-14 jobs. A linkage of O-5 to GS-14 was found.

Statistical analysis also indicated that the sample GS-15
positions had more job content than the sample O-5 positions.

B.2. Relationships Between the General Schedule System and the
O-8 Grade

The ladder in Figure 5 shows the relationship of selected
levels in the Civil Service General Schedule system to the
military pay grade of O-8. (The values of O-5, GS-14 and GS-15
from Figure 4 are reproduced in Figure 5 for convenient reference.)
The following relative positioning of the military O-8 and
the GS-18 pay grade was found by inspection:

• The content of the sample O-8 jobs is similar to the
content of the sample GS-18 jobs, to some extent. The
more difficult GS-18 jobs are as difficult as some O-8
jobs. However, the less difficult GS-18 jobs were evaluated
as far less difficult than any of the O-8 jobs.

• In terms of median job content, the median O-8 job was
evaluated as much more difficult than the median GS-18
job. The O-8 median is 57% greater than the GS-18 median.

• In terms of range, there was overlap between the range
of O-8 positions and GS-18 positions. The very wide
range of difficulty in the sample GS-18 jobs contributed
to this finding.

Statistical analyses also indicated that the sample O-8
jobs were different in difficulty from the GS-18 positions. The
1967 QRMC found a linkage of O-8 to GS-18. This linkage was
not supported by the results of the present study. The statistical
analyses indicated that the sample GS-18 positions have less job
content (difficulty), in terms of median value, than do the sample
O-8 positions. The results suggest that GS-18 might be a
lower bound or limit with respect to O-8.

Discussion 


This study indicated the feasibility of evaluating both
military and Federal Civil Service occupations using a
point-factor job evaluation approach (the Hay Method). Comparisons
of job content between pay grades in the military and Civil
Service could be made, since jobs were evaluated on a common
standard. In addition, the data were analyzed to determine
overlap in job difficulty within each classification system.
Further, it was possible to analyze "pure" military jobs and
compare their difficulty to military jobs which have civilian
counterparts. Indeed, a significant finding of this study was
the fact that in the enlisted grades and in the company grade
officer ranks, the content of "pure" military jobs did not differ
systematically from the content of military jobs with civilian
counterparts.

Implications of the Findings

In contrast to previous efforts, this study employed a
standard statistical criterion to determine if statistical
linkages existed. This approach was more stringent than the
criteria applied in previous studies in that the job content
in two pay grades had to be quite similar before they could be
considered statistically linked.1
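
The footnoted criterion requires that two samples differ by less than a predefined minimum amount before a linkage is declared. The sketch below is a deliberately simplified stand-in for that idea, assuming the comparison is made on medians with a relative tolerance. The function name, the tolerance value, and the scores are all hypothetical; the study's actual tests compared distributions and are not reproduced here.

```python
from statistics import median


def is_linked(military_scores, civil_scores, tolerance=0.10):
    """Declare a linkage only when the medians differ by less than `tolerance`.

    `tolerance` is a hypothetical threshold expressed as a fraction of the
    Civil Service median; it is not the threshold used in the study.
    """
    m = median(military_scores)
    c = median(civil_scores)
    return abs(m - c) / c < tolerance


# Hypothetical total point scores, not data from the study.
o8 = [1400, 1500, 1570, 1650, 1800]
gs18 = [700, 900, 1000, 1200, 1550]
e3 = [95, 104, 109, 118, 130]
gs3 = [90, 96, 100, 107, 121]

print(is_linked(o8, gs18))  # False
print(is_linked(e3, gs3))   # True
```

A test of this kind places the burden of proof on similarity: two grades are treated as unlinked unless the evidence shows they are close, which is why it is stricter than judging linkage by inspection.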

By applying this statistical criterion of linkage, the present
study failed to verify some of the linkages identified by the
1967 QRMC, and used in previous pay comparability analyses.
Thus, the present study did not find statistical linkages at
these levels:

• O-8 and GS-18

• O-1 and GS-7

• E-3 and WG-5

Statistical linkages were found only for E-5 and WG-10, E-5 and
GS-5, and E-3 and GS-3. Thus, only half of the linkages employed
by the 1967 QRMC were supported by the results of the present
research. Note that the 1967 QRMC study did not employ this
stringent statistical criterion of a linkage. Hence, some
differences in the results are not surprising.


SUMMARY OF LINKAGE EVALUATIONS

Source       Proposed Linkage    Statistical Linkage Found

1967 QRMC    O-8/GS-18           NO
             O-1/GS-7            NO
             E-5/GS-5            YES
             E-5/WG-10           YES
             E-3/GS-3            YES
             E-3/WG-5            NO

1 In this study, a linkage required the demonstration that the
distribution of job content in two samples (one military grade,
one Civil Service grade) was similar. Tests were made to see
if the results differed by less than a predefined minimum
amount. Only when this condition was satisfied was a
statistical linkage said to exist.
The present study also evaluated proposed linkages at
levels which were not studied by the 1967 QRMC. The
findings support the possibility of a statistical link between
O-5 and GS-14, between O-2 (or O-1) and GS-9, and between
E-7 and GS-7. A statistical linkage of E-3 and WG-6 was also
found.

In total, only half of the linkages hypothesized by the
1967 QRMC were substantiated by the results of this study.

In  spite  of  these  findings,  the  need  still  remained  for 
the  1975  QRMC  to  establish  the  comparability  of  job  content 
between  pay  grades  in  the  military  and  Federal  Civil  Service 
classification  systems. 


Given the problems associated with attempts to define precise
linkages between the military service and the Federal Civil
Service, the thrust of this research was redirected toward
determining the relative position of military pay grades with
respect to Civil Service pay grades, instead of trying to link
military and Civil Service pay grades on a one-for-one basis.

The relative positioning approach was less stringent than
the linkage approach. In the linkage approach, the objective
was to find two pay grades quite similar in job content. In
contrast, the relative positioning approach simply required that
Civil Service pay grades be found whose median job content was
above or below the median job content of the military pay grade
in question. Such Civil Service pay grades would then bracket
the military grade and provide the basis for locating the military
grade with respect to the two Civil Service grades. Thus, one
could position the median job content of the military grade
as some fraction of the distance between the median job content of
the two Civil Service grades which bracket the military grade.
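
The positioning calculation just described can be sketched in a few lines of Python. This is an illustration only: the function name `relative_position` and the Hay point medians in the example are hypothetical, since the actual median values are not reproduced in this paper.

```python
def relative_position(military_median, lower_cs_median, upper_cs_median):
    """Return how far the military median lies between two bracketing
    Civil Service medians, as a fraction from 0.0 (at the lower grade)
    to 1.0 (at the upper grade)."""
    if not lower_cs_median < upper_cs_median:
        raise ValueError("lower median must be below upper median")
    if not lower_cs_median <= military_median <= upper_cs_median:
        raise ValueError("the Civil Service grades do not bracket the military grade")
    return (military_median - lower_cs_median) / (upper_cs_median - lower_cs_median)


# Hypothetical medians, chosen only to illustrate the arithmetic.
fraction = relative_position(military_median=224, lower_cs_median=200, upper_cs_median=300)
print(f"{fraction:.0%} of the way from the lower to the upper grade")
```

A result near zero places the military grade close to the lower Civil Service grade; a result near one places it close to the upper grade.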

The relative positioning approach was successfully applied
at six military pay grades: E-3; E-5; E-7; O-1; O-2; and O-5.


Results  are  presented  below  for  the  enlisted  grades  and 
the  General  Schedule  (white-collar)  pay  grades: 


RELATIONSHIPS OF SELECTED ENLISTED PAY GRADES
TO U.S. CIVIL SERVICE GENERAL SCHEDULE PAY GRADES


Level   Relationship of Median             Statistical Linkage
        Hay Point Values                   Findings

E-7     Compared to GS-7 and GS-9,         E-7 and GS-7 link.
        the E-7 median is 18% of the
        way from GS-7 to GS-9.

E-5     Compared to GS-5 and GS-7,         E-5 and GS-5 link.
        the E-5 median is 2[illegible]%
        of the way from GS-5 to GS-7.

E-3     Compared to GS-3 and GS-5,         E-3 and GS-3 link.
        the E-3 median is 11% of the
        way from GS-3 to GS-5.


The  relationships  of  the  military  enlisted  grades  to 
Civil  Service  blue  collar  pay  grades  were  also  determined. 


RELATIONSHIPS OF SELECTED
ENLISTED PAY GRADES TO U.S. CIVIL
SERVICE WAGE GRADE/WAGE SUPERVISORY PAY GRADES


Level   Relationship of Median             Statistical
        Hay Point Values                   Findings

E-7     Compared to WS-9 and WS-10,        WS-10 is an upper bound
        the E-7 median is 24% of the       on E-7; WS-9 is a lower
        way from WS-9 to WS-10.            bound.

E-5     Compared to WG-8 and WG-10,        E-5 and WG-10 are a
        the E-5 median is 84% of the       statistical link. WG-8 is
        way from WG-8 to WG-10.            a lower bound on E-5.

E-3     Compared to WG-5 and WG-6,         E-3 and WG-6 are a
        the E-3 median is 79% of the       statistical link. WG-8 is
        way from WG-5 to WG-6.             an upper bound on E-3;
                                           WG-5 is a lower bound.


The median job content of E-3, E-5, and E-7 was bracketed by pay
grades in the Wage Grade and Wage Supervisory classification
systems.

The relationship of military officer grades to pay grades
in the Civil Service General Schedule was also examined.


RELATIONSHIPS OF SELECTED
OFFICER PAY GRADES TO U.S. CIVIL
SERVICE GENERAL SCHEDULE PAY GRADES


Level   Relationship of Median             Statistical
        Hay Point Values                   Findings

O-8     The O-8 median exceeds             GS-18 is a lower bound
        the GS-18 median by 57%.           on O-8.

O-5     Compared to GS-14 and GS-15,       O-5 and GS-14 are a
        the O-5 median is 24% of the       statistical link; GS-15 is
        way from GS-14 to GS-15.           an upper bound on O-5.

O-2     Compared to GS-9 and GS-11¹,       O-2 and GS-9 are a
        the O-2 median is 14% of the       statistical link.
        way from GS-9 to GS-11.

O-1     Compared to GS-7 and GS-9,         GS-7 is a lower bound on
        the O-1 median is 54% of the       O-1. GS-9 and O-1 are a
        way from GS-7 to GS-9.             statistical link.

¹ The GS-11 median value was taken from research conducted in
another study. No other GS-11 estimate exists for comparison
purposes. However, caution is recommended in the use of this value
because the sample size is small (N=5). Further, different criteria
were established in each study for the representativeness of
the PATC categories, and different population frames were used.

It was possible to bracket the military grades of O-1, O-2
and O-5 using the General Schedule pay grades noted in the
table.

These  relationships  may  be  used  for  estimating  grade 
comparability  between  the  military  service  and  the  Federal 
Civil  Service. 

Summary 

In  summary,  a  limited  number  of  whole  grade  linkages 
are  identifiable  between  military  and  Civil  Service  pay 
grades.  However,  for  many  military  and  Civil  Service  grades 
the  identification  of  whole  grade  linkages  is  not  possible. 
Therefore,  the  relative  position  of  pay  grades  in  the  two 
systems  is  important  to  consider. 

The application of the point-factor job evaluation
methodology employed in this research permits the identification
of both the linkage of whole military and Civil
Service grades and the relative positioning of military and
Civil Service grades.

Finally, even though it was not a part of the research
effort, it should be noted that this job evaluation approach
permits the linkage of military jobs to private sector jobs.
Hundreds of private sector firms employ this job evaluation
approach, and an extensive private sector job evaluation and
salary data base exists.


WHITE-COLLAR JOB EVALUATION AND PAY SYSTEMS


By: Rosemary Storm

CURRENT  INITIATIVES 


In the Federal civilian white-collar world, job evaluation and pay are
inexorably linked. This paper will set forth the existing statutory
framework, some relatively recent history on pay comparability, and
current initiatives for improvement.

JOB  EVALUATION 

Job evaluation in the Federal Government is probably the most
sophisticated, detailed and studied in the world. A wide range of
occupations is required to accomplish varied Federal activities.
Under law, there must be equal pay for substantially equal work
across the many agency lines as well as across occupational lines.

Over 1.3 million white-collar positions are classified under the
18-grade General Schedule. The 18 grade levels and their definitions
are set in law. Authority to classify positions is vested in the head
of each agency, and positions must be classified in accordance with
published standards issued by the Civil Service Commission.

In 1977, Federal agencies began implementation of the new Factor
Evaluation System, an improved, standardized factor/point methodology
for classifying non-supervisory positions in grades GS-1 through
GS-15, after several years of CSC development, testing, modification
and retesting. Key elements of the new Factor Evaluation System
include a common set of nine factors, defined degrees of each factor,
benchmark descriptions of representative positions at various grades,
and a conversion chart that translates total points into GS grades.
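
The point-to-grade step described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the Commission's procedure: the point bands in `CONVERSION_CHART`, the function name `grade_for`, and the sample factor points are all hypothetical and are not taken from the published conversion chart.

```python
from bisect import bisect_right

# Hypothetical conversion chart: (minimum total points, grade).
# These bands are invented for illustration; they are not the CSC chart.
CONVERSION_CHART = [
    (0, "GS-1"),
    (500, "GS-3"),
    (1000, "GS-5"),
    (1500, "GS-7"),
    (2000, "GS-9"),
]


def grade_for(factor_points):
    """Sum the points assigned on each of the nine factors and look up
    the grade whose band contains the total."""
    if len(factor_points) != 9:
        raise ValueError("expected points for exactly nine factors")
    if any(points < 0 for points in factor_points):
        raise ValueError("factor points must be non-negative")
    total = sum(factor_points)
    minimums = [minimum for minimum, _ in CONVERSION_CHART]
    index = bisect_right(minimums, total) - 1
    return total, CONVERSION_CHART[index][1]


# Hypothetical degree points for one position, one entry per factor.
total, grade = grade_for([550, 125, 125, 150, 75, 25, 50, 5, 5])
print(total, grade)
```

Each factor is scored independently at a defined degree, so the only arithmetic involved is a sum followed by a table lookup.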

The overall reaction to the Factor Evaluation System is clearly
favorable. It is relatively simple to apply and is easily understood
by classifiers, managers, and employees. Most commentators agree that
it provides accurate grades, improved alignment across occupational
and agency lines, and better documentation of classification decisions.

An accelerated standards production program is underway, using the new
methodology. In FY 1977, the Civil Service Commission issued initial
standards covering 52,000 positions. Plans are to complete a basic
set of new standards within five years.

PAY


The Federal pay comparability process was developed in the late 1950's
and early 1960's. It involved four years of data gathering by the
Bureau of Labor Statistics (1958-1962), culminating in President
Kennedy's urging salary reform and the comparability principle, and
the Congress' enacting the Federal Salary Reform Act of 1962. At the
time of enactment, Federal pay was so far below comparability that
comparisons could be made in rather gross terms.

In 1969 full comparability was reached, and more precise measurement
became crucial. There were also changes in the national labor market
and in the Federal work force that called into question some aspects
of white-collar pay comparisons.

In 1973, a General Accounting Office report pointed out the following
needs for improvements: (1) more emphasis on pay research; (2) broader
coverage of the PATC survey (this is the Bureau of Labor Statistics
survey of professional, administrative, technical, and clerical jobs
which is used as a data base to set Federal white-collar pay); and
(3) broader industrial scope of the PATC survey. Then, too, other
criticisms began to mount from both inside and outside Government.

These criticisms generally said that pay for some jobs was too high,
and pay for some other jobs was too low. In 1974, the Civil Service
Commission undertook research projects in two major categories:
(1) improving the present system, and (2) exploration of other methods
to get to closer pay comparability.

It was against this background that President Ford, in his FY 1976
budget, announced plans to establish a blue ribbon panel to make policy
recommendations to him on Federal pay. Vice President Rockefeller was
appointed to head this panel.

The overall theme of the recommendations was support for the
comparability principle to set pay. The specific recommendations were
of two kinds: those that could be implemented without legislation
(e.g., improved statistical techniques) and those that would require
legislation. The major recommendations that would require legislation
include:

-  splitting the current monolithic General Schedule into two
   basic Services: a Clerical and Technical Service with
   local pay schedules and a Professional and Administrative
   Service with national pay schedules;

-  authorizing special occupational services when the regular
   service hampers management's ability to recruit, retain, or
   manage a well-qualified work force;

-  authorizing use of State and local government pay data
   in Federal pay surveys;

-  conducting major pay surveys less frequently than annually
   (but using a statistical indicator to adjust pay in
   intervening years);

-  combining or eliminating separate Federal civilian pay
   systems as needed;

-  studying and developing a pay advancement system for
   professional and administrative employees based on quality of
   performance; and

-  developing and testing methodologies for extending the
   principle of comparability to benefits as well as pay
   (total compensation comparability).

Methodologies for achieving a total compensation comparability system
are now being tested. The Bureau of Labor Statistics has conducted
successful preliminary tests of benefits data collection. Expanded
testing is planned for FY 78. This is a project of tremendous technical
difficulty. It has stirred considerable interest inside and outside
Government, and we look forward to viewing the results of the further
testing and developmental work.

On May 27, 1977, President Carter established a Federal Personnel
Management Project to make a top-to-bottom study of Federal personnel
management as a part of his reorganization effort. A report to the
President is due in November 1977. A legislative proposal to accomplish
those recommendations submitted by the 1975 President's Panel on Federal
Compensation was sent to the Personnel Management Project for review
and consideration. As a part of this project, just four short days ago,
an options paper on job evaluation, pay and benefits was issued to the
public. Eight issues are discussed, with various options for each
issue. The issues are:

1.  Should the Government extend its pay comparability policy
    to include benefits as well as pay (total compensation
    comparability)?

2.  What methods of measurement should be used in comparing
    Federal benefits with non-Federal benefits?

3.  Should there be central authority for granting benefit
    changes to Federal employees? If so, where should it
    reside?

4.  To bring about closer pay comparability with other
    employers:

    (a)  Should the General Schedule be divided into two or
         more homogeneous occupational groupings with separate
         classification and pay systems?

    (b)  Should some or all of the General Schedule work force
         be paid on the basis of local rates?

    (c)  Should the President be authorized to establish and
         abolish special pay systems for specific occupations
         or groups of occupations?

5.  Should comparisons include State and local government
    employees for purposes of establishing comparability?

6.  Can the principle of merit pay be used to improve and reward
    employee performance?

7.  What improvements are needed in the job evaluation process?

8.  What should be done about the changing relationship between
    blue-collar and white-collar pay rates?

Views from interested parties will be considered and recommendations
formulated for transmittal to the President in November, and for
possible inclusion in his budget message in January 1978.


CERTIFICATION  &  LICENSURE  PROGRAMS  FOR 
OCCUPATIONAL  SKILL  DOCUMENTATION 


Roger G. Goldberg

The Role of the Defense Activity for Non-Traditional Education Support
(DANTES) in the Provision of Skill Documentation Programs

The Defense Activity for Non-Traditional Education Support (DANTES)
was established by Congress in June 1974 to provide independent study and
examination programs for military personnel.

Through memoranda of understanding executed with approximately 65
regionally accredited colleges and universities, we publish a catalog and
provide descriptive information on over 10,300 independent study courses
from the high school and undergraduate level through graduate study and
professional continuing education. This variety of educational programs
is made available so that military personnel can continue to pursue
their educational objectives no matter how isolated or remote the duty
station or how inaccessible resident study may be.

Through contractual agreements with organizations such as the
Educational Testing Service, College Entrance Examination Board, American
Council on Education and American College Testing, we provide examination
programs such as the GED High School Equivalency Examinations, the
College-Level Examination Program (CLEP), the DANTES Subject Standardized
Tests (DSSTs), Graduate Record Examinations and many others. These
examinations provide credit for high school course equivalency, high
school equivalency certificates, college entrance and credit, and
graduate admissions.

Through approximately 880 testing sites located throughout the world,
over 275,000 examinations are administered yearly.

In the college-level credit examination programs, the CLEP and DSSTs,
191,000 examinations were administered through August of this year.
Approximately 101,000 individuals passed the examinations with scores at
or above the level established by the American Council on Education for
the awarding of academic credit. These examinations provide a potential
of 302,700 credit hours at a cost of $2.87 per credit hour. This compares
with the civilian-administered examination program cost of approximately
$7.00 per credit hour or the in-service tuition assistance cost of
approximately $25.00 per credit hour. We estimate, based on these two
examination programs alone, a potential cost savings/cost avoidance to
DoD of over $5.7 million.
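
The per-credit-hour figures quoted above can be compared directly. The sketch below is an illustrative calculation only; it is not the method used to produce the savings estimate, which is not described in this paper, and the names `COST_CENTS` and `total_cost_dollars` are assumptions. Rates are held in whole cents so the multiplication stays exact.

```python
CREDIT_HOURS = 302_700

# Cost per credit hour in cents, as quoted in the text.
COST_CENTS = {
    "DANTES examinations": 287,
    "civilian-administered examinations": 700,
    "in-service tuition assistance": 2500,
}


def total_cost_dollars(cents_per_hour, hours=CREDIT_HOURS):
    """Total cost in dollars of earning `hours` credit hours at the given rate."""
    return cents_per_hour * hours / 100


dantes_total = total_cost_dollars(COST_CENTS["DANTES examinations"])
for name, cents in COST_CENTS.items():
    total = total_cost_dollars(cents)
    print(f"{name}: ${total:,.2f} (difference from DANTES: ${total - dantes_total:,.2f})")
```

On these figures the difference is roughly $1.25 million against civilian-administered examinations and roughly $6.70 million against tuition assistance; the savings estimate quoted above falls between the two.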

In addition to the academically oriented examination programs, DANTES
has been expanding both the role and number of vocational-technical and
para-professional examination programs within DoD. We view our activities
in this area as supportive of the various service-sponsored job
recognition/skill documentation programs and anticipate significant
growth in this program area.

Capitalizing on our unique role as the provider of examination
programs for the Department of Defense, we have developed memoranda of
understanding with a number of nationally recognized certification
organizations allowing for DANTES Test Control Officers, world-wide, to
administer technical and professional certification examinations.

To date, the following organizations have signed memoranda with DANTES,
and their certification-by-examination programs are being administered
by DANTES Test Control Officers at military installations.

American Association of Medical Assistants

    Certified Medical Assistant
        Basic
        Administrative
        Clinical

American Medical Technologists

    Medical Technologist
    Medical Laboratory Technician
    Registered Medical Assistant

Institute for the Certification of Engineering Technicians

    Certified Associate Engineering Technician
    Certified Engineering Technician
    Certified Senior Engineering Technician

        Architectural & Building Construction Technology
        Civil Engineering Technology
        Electronics Engineering Technology
        Fluid Power Engineering Technology
        Industrial Engineering Technology
        Mechanical Engineering Technology
        Metallurgical Engineering Technology
        Geotechnical Engineering Technology
        Construction Materials Testing
        Electrical Power (Production/Transmission/Sub-Station
            Distribution)

National Institute for Automotive Service Excellence

    Certified General Automobile Mechanic
        Engine Repair
        Automatic Transmission
        Manual Transmission and Rear Axle
        Front End
        Brakes
        Electrical Systems
        Heating and Air Conditioning
        Engine Tune-Up

    Certified Heavy-Duty Truck Mechanic
        Gasoline Engines
        Diesel Engines
        Drive Train
        Brakes
        Suspension and Steering
        Electrical System

    Certified Body Repairer
    Certified Painter & Refinisher

Institute for Certification of Computer Professionals

    Certified Data Processor
    Certified Computer Programmer
        Business Programming
        Scientific Programming
        System Programming

International Society for Clinical Laboratory Technology

    Registered Medical Technologist
    Registered Laboratory Technician

National Registry of Emergency Medical Technicians

    Registered Emergency Medical Technician
        Ambulance
        Non-Ambulance
        Paramedic

Institute of Certified Professional Managers

    Certified Manager

In addition to the above organizations, agreements have been effected
or are being negotiated with the American Society for Quality Control, the
American Registry of Radiologic Technologists, the Academy of Certified
Social Workers and with approximately thirty-five states using a uniform
examination for the licensing of real estate salespersons and brokers.

Most state licensing examination programs are highly state-specific
in terms of the various requirements for licensing, the design of the
test instrument and the jurisdictional acceptance of the examination.
The lack of formalized reciprocity agreements and/or comity amongst
the states in regard to state licensing examinations has limited the
utility of licensing examination programs for DANTES administration.
The relatively uniform examination procedures employed in the field of
real estate licensing adapt well to our testing program, and we hope
that additional uniform licensing examinations will be identified and
incorporated into our testing program in the future.

In the meantime, however, we are attempting to provide the Armed
Forces Education Centers with information on state licensing programs.
Recently, Florida State University, under a DANTES contract, completed
a survey of the agencies within every state that had proponency for the
regulation and licensing of occupations. The results of the survey will
be published as the Directory of Selected Licensable Occupations and
will be distributed to all Armed Forces Education Centers world-wide.

Over thirty occupations will be represented in the Directory, with
information provided on age, experiential, education and residency
requirements; examination and licensing fees; dates and sites of
examination administrations; retest policy and procedures; reciprocity
and comity agreements; and contact points for further information.

The Directory represents a first-time effort to gather this type of
information on a systematic and nationwide basis. In the early research
phase of the project the Department of Labor was contacted and informed
of our intentions, and it has since expressed keen interest in providing
funding for this effort on a continuing basis.

This past year DANTES conducted a pilot project regarding the use of
the National Occupational Competency Testing Institute (NOCTI)
Examinations. Originally developed under an HEW contract, the NOCTI
Examinations are designed to certify as vocational-technical instructors
those individuals with many years of on-the-job experience, but little
formal education.

The NOCTI Examinations are available in a number of vocational-
technical skill areas and consist of a written multiple-choice
examination and a practical demonstration of skills examination
supervised and evaluated by a certified NOCTI evaluator.

The NOCTI Examinations were administered to 159 military examinees
across the country, and the scores were provided to the Los Angeles City
Colleges and the City Colleges of Chicago for credit evaluation. Credit
recommendations from the City Colleges of Chicago ranged from 0 to 36
semester credit hours, with a mean recommendation of 15 hours.

During this fiscal year DANTES will be exploring with the Department
of Labor and the individual services the adaptability of the NOCTI
Examinations for credit towards the formalized educational requirements
of the service-sponsored apprenticeship programs.

Each apprenticeship program registered with the Department of Labor
usually has a formal training requirement of 144 hours of instruction per
apprenticeship year. For the apprenticeship programs registered by the
military to date, the Department of Labor has recognized service school
technical training as fully meeting program requirements. However, this
aspect of the apprenticeship program would exclude individuals who have
not attended or completed a service school and who gained their skill
proficiencies through on-the-job training and experience. We are hopeful
that the Department of Labor will allow for the use of the NOCTI
Examinations in lieu of formalized training so that all qualified
military personnel may participate in the various apprenticeship
programs.

Acceptance by the Department of Labor of the NOCTI Examinations would
provide the NOCTI Examinations with a potential dual capability:
apprenticeship credit and academic credit. This dual capability would
further expand career enhancement opportunities for service personnel as
well as complement existing programs for obtaining academic credit.
Military personnel may already earn academic credit for their MOS
(Military Occupational Specialty) or NEC (Navy Enlisted Classification),
as well as for the courses taken in service schools. With the
combination of programs currently available, military personnel often
find that after several years of service an Associate degree or
Baccalaureate degree is within reach.

If the Office on Educational Credit of the American Council on
Education can successfully establish academic credit recommendations for
apprenticeship programs and certification examinations, as it is now
investigating, military personnel will benefit from increased
opportunities for academic achievement and peer recognition, and from
more meaningful alternatives for personal growth and career enhancement.


Submitted by: Roger G. Goldberg
Head, Certification & Licensure Program Branch
Defense Activity for Non-Traditional Education Support
Ellyson Center
Pensacola, Florida 32509


DANTES Discussion Panel


NATIONAL  APPRENTICESHIP  PROGRAM 


Program Manager: Ken Telller


CONCEPT/     To provide a brief description of the National Apprenticeship
PURPOSE      Program for active-duty personnel.

On 24 March 1976, the Secretary of Labor and the Secretary of
the Navy signed an agreement which would permit active-duty
Navy personnel to complete apprenticeships in civilian trades.

In accordance with this agreement, the Bureau of Apprenticeship
and Training of the Department of Labor recognizes certain
Navy skills as civilian "apprenticeable occupations." Navy
persons who achieve documented levels of experience and training
in these skills are recognized by the Department of Labor.

The Navy, on the other hand, agrees to civilian standards of
training and experience and administers the program in a manner
acceptable to civilian industry.

STATUS/      CNET has received approval from the Department of Labor to
DISCUSSION   register and administer apprenticeships for five Navy skills:

             Office Machine Mechanic
             Watch/Clock Repairer
             Commercial Photographer
             Camera Repairer
             Hotel and Restaurant Cook

As of 15 September 1977, 123 personnel in the IM, PH, and MS
ratings have been registered in the five trades listed above
with the Department of Labor. To qualify for registration,
the Navy person must have completed from 288 to 432 hours
of formal instruction in his trade (usually completed at an
applicable Class A or Class C school). To qualify for the
completion certificate awarded by the Department of Labor,
the registered apprentice must complete from 4000 to 6000
hours of documented work experience in the trade.


There is little likelihood that a Navy person can complete
an apprenticeship within a four-year enlistment. Progress
in an apprenticeship goes hand-in-hand with advancement in
rating. After a Navy person has completed the required
work experience and is regarded as a journeyman by the
Department of Labor, he not only has civilian proof of competence
in a trade, but he is a more competent Navy person.


CHAPTER II

NATIONAL APPRENTICESHIP PROGRAM


201.  INTRODUCTION

1.  The National Apprenticeship Program is a component program of
the Navy Campus for Achievement established by written agreement between
the Secretary of Labor and the Secretary of the Navy on 24 March 1976.

2.  The objectives of a National Apprenticeship Program are to
develop highly skilled Navy-oriented journeymen who will continue to
utilize their technical skills and knowledge within the Navy, and who
will incidentally qualify for employment in a recognized civilian trade
after expiration of enlistment or upon retirement. Adherence to the
standards of an apprenticeship program will also reinforce efforts
leading to advancement in rating by the individual apprentice. The
success of the program will gain wide recognition of the worth of Navy
training and experience.

3.  The  Chief  of  Naval  Education  £ud  Training  will  identify  the 
trades  to  be  introduced  as  apprent  iceuble  occupations  within  the  active 
duty  N*vy.  The  identification  of  a  trade  for  an  apprenticeship  will 
depend  upon  the  following:  the  conditions  and  trends  of  the  national 
labor  scene;  the  assurance  that  selected  and  registered  active  duty 
apprentices  will  receive  work  experience  and  related  instruction  similar 
to  that  received  in  the  civilian  sector;  the  availability  of  facilities 
and  supervisory  personnel  for  on-the-job  training  and  related  instruction; 
and  the  assurance  that  administrative  procedures  and  controls  will  be 
exercised in a manner that will earn the confidence and respect of the
civilian  trade  sector. 


202. TERMINOLOGY. To be credible and usable, the National Apprenticeship
Program for active duty personnel is obliged to utilize the
terminology  employed  by  the  Bureau  of  Apprenticeship  and  Training  of  the 
U.S.  Department  of  Labor.  The  following  terms  are  the  same  as  used  in 
civilian  apprenticeship  programs: 

1. Apprentice. A person who is properly registered with the
Bureau of Apprenticeship and Training of the Department of Labor. It is
to be noted that the words "apprentice" and "apprenticeship" in civilian
programs are not the same as the apprenticeship associated with pay
grade E-2.


2. Journeyman. A person who has satisfactorily completed an
apprenticeship and who has been awarded a certificate attesting to this
completion. Navy personnel completing an authorized apprenticeship
under the National Apprenticeship Program will be awarded a Certificate
of Apprenticeship Completion by the Bureau of Apprenticeship and Training,
Department of Labor.

3. Registration. The action by which a qualified individual is
earmarked by the Bureau of Apprenticeship and Training of the Department
of Labor as an apprentice. Registration of an individual can only be
accomplished by the commanding officer, officer in charge, or director
of the Class "A," Class "C," or Class "J" school providing related
instruction for the apprenticeable trade. Registration is accomplished
by completing an Apprentice Registration Application, CNET Form 1560/1,
and by issuing a work experience log to the qualified registrant.

4. Work Experience. Verified on-the-job participation in the
skills required by an apprenticeable trade. It is the responsibility
of the individual apprentice to record the hours of work experience on
a Work Experience Hourly Record, CNET Form 1560/3, and have such
entries verified weekly by the signature of the leading petty officer
or work center supervisor. Work experience is not to be confused with
related instruction.

5. Previous Work Experience. Verified on-the-job participation
in the skills required by an apprenticeable trade which was completed
prior to registration. Hours of previous work experience, if any, are
entered in the apprentice registration application only by the commanding
officer, officer in charge, or director of the Class "A," "C," or
"J" school at the time of registration.

6.  Related  Instruction.  The  formal  and/or  classroom  training 
acquired  at  an  applicable  Class  "A,"  "C,"  or  "J"  school,  which  must 
be  completed  as  a  condition  of  registration,  and  which  provides  the 
potential  apprentice  with  the  required  background  knowledge  and 
information of the trade. Related instruction is not interchangeable
with  work  experience. 

7. Work Experience Log. A booklet issued to an apprentice at
the time of registration and held thereafter as a personal possession.
The work experience log will be reviewed twice yearly during apprentice
progress interviews by the most accessible Navy Campus education
specialist, with particular emphasis upon the Work Experience Hourly Record,
CNET Form 1560/3. The work experience log identifies the apprenticeable
trade and contains the following parts/documents:

a.  Information  for  the  apprentice. 

b. National Apprenticeship Standards for the U.S. Navy.

c. Work Processes Schedule for the trade.

d. Apprentice Progress/Status Report (CNET Form 1560/2).

e. Work Experience Hourly Record (CNET Form 1560/3).

f. Apprentice Registration Application (CNET Form 1560/1).
The original of this form is inserted at time of registration.

g. Other documentation. (For example, copies of the service
record, page 4; letters from previous employers attesting to previous
work experience; copies of documents from non-Navy schools attesting to
qualifying related instruction.)

8. Work Processes Schedule. A listing of the skill areas within
an apprenticeable trade, together with the hours of work experience
assigned to each skill area (often referred to as a "work experience plan").
The work processes schedule, which is contained in the work experience
log, tells the apprentice how many hours of experience must be completed
in each skill area of the trade. It is a breakdown of the work experience
to be completed during the term of the apprenticeship. Before an entry
is made in the work experience hourly record, the apprentice must refer
to the work processes schedule in order to identify the skill area in
which the work experience has been completed.

9. Work Experience Hourly Record. A form contained within the
work experience log which is used for the entry and verification of
completed hours of on-the-job skills.

10. Certificate of Apprenticeship Completion. A document issued
by the Bureau of Apprenticeship and Training of the Department of Labor
attesting to the fact that an individual has completed the apprenticeship.


203.  REGISTRATION  PROCEDURES 

1. In order to qualify for registration, an individual must
have graduated from the Class "A," "C," or "J" school applicable to the
apprenticeable trade and must be serving, or about to serve, in an
authorized apprenticeable trade. Authorized apprenticeable trades are
listed in paragraph 208.

2. The commanding officer, officer in charge, or director of an
applicable Class "A," "C," or "J" school is authorized to waive the Navy
school requirement if the applicant can provide documentation of satisfactory
completion of the required hours of related instruction at an
Army, Air Force, Marine Corps, or civilian school. The Department of
Labor requires 144 hours of related instruction for each 2000 hours of
an apprenticeship. Therefore, a 6000-hour apprenticeship will require
432 hours of documented related instruction; an 8000-hour apprenticeship,
576 hours; and a 4000-hour apprenticeship, 288 hours.

3. The registration process begins by submitting an Apprentice
Registration Application, CNET Form 1560/1 (Exhibit II-A), to the
commanding officer, officer in charge, or director of the Class "A," "C,"
or "J" school providing the required related instruction. Registration
is usually accomplished in person; however, registration may also be
accomplished by mail if the eligible applicant sends an original and two
copies of the application, with the "Applicant Information" entered,
to the applicable Class "A," "C," or "J" school.

4. Personnel who register by mail and who are also eligible for
previous work experience credit in the trade should include the following
documents (see Exhibit II-B for a recommended forwarding letter):

a. A reproduced copy of page 4 of the service record which
displays assignment to a Navy Enlisted Classification (NEC) translatable
to previous work experience. See paragraph 208 for NEC's which are
authorized for translation into credit for previous work experience.

b.  Original  letters  or  similar  documentation  from  previous 
employers  which  attest  work  experience  in  a  skill  or  skills  associated 
with the apprenticeable trade.

5.  The  commanding  officer,  officer  in  charge,  or  director  of 
the  appropriate  school  completes  the  bottom  half  of  the  apprentice 
registration  application.  An  applicant  may  be  credited  with  1000  hours 
of  previous  work  experience  for  every  full  year  that  the  applicant's 
service record (page 4) indicates assignment to an NEC cited in paragraph
208 for the apprenticeable trade. 1000 hours of previous work
experience  may  also  be  credited  for  every  full  year  of  work  experience 
in  the  trade  which  is  verified  by  an  original  letter  from  a  previous 
employer. 


6. Credit for previous work experience cannot exceed
50 percent of the term of the apprenticeship; for example, no more than 3000
hours of previous work experience can be credited to a 6000-hour apprenticeship.
Portions or fractions of years of previous work experience
will not be credited. Only full years will be translated and credited.
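
The crediting rules in paragraphs 5 and 6 (1000 hours per full year, fractions of years dropped, and a ceiling of 50 percent of the term) can be expressed as a short calculation. The following Python sketch is illustrative only and is not part of the instruction; the function and constant names are hypothetical.

```python
import math

# Illustrative sketch only; all names here are hypothetical.
HOURS_PER_FULL_YEAR = 1000  # credit for each full year of qualifying experience
MAX_CREDIT_FRACTION = 0.5   # credit may not exceed 50 percent of the term


def previous_experience_credit(years_of_experience: float, term_hours: int) -> int:
    """Return creditable hours of previous work experience."""
    if years_of_experience < 0 or term_hours <= 0:
        raise ValueError("years must be non-negative and term must be positive")
    # Only full years count; portions or fractions of years are not credited.
    full_years = math.floor(years_of_experience)
    uncapped = full_years * HOURS_PER_FULL_YEAR
    # Apply the 50 percent ceiling on credit.
    ceiling = int(term_hours * MAX_CREDIT_FRACTION)
    return min(uncapped, ceiling)


if __name__ == "__main__":
    print(previous_experience_credit(1.1, 6000))  # prints 1000
    print(previous_experience_credit(4.5, 6000))  # prints 3000
```

Paragraph 208 also lists a maximum credit for each trade, which in some cases is lower than the 50 percent ceiling; this sketch does not model those per-trade maxima.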

7. Upon completion of all items on the apprentice registration
application, the commanding officer, officer in charge, or director of
the school will issue/mail the appropriate work experience log to the


Enclosure  (1) 




CNETINST 1560.3A


applicant. The following documents will be inserted in the log
before  issuance/mailing: 

a.  One  copy  of  the  completed  apprentice  registration 
application. 

b. Service record pages and letters, if any, cited in
paragraph 4 above.

8. Two copies of the completed apprentice registration application
will  be  forwarded  to  the  Chief  of  Naval  Education  and  Training. 


204.  CANCELLATION  OF  REGISTRATION 

1. A registration will be canceled for any one of the following
reasons: 


a.  Request  of  the  apprentice. 

b. A rating in the lower 50 percent in "Professional
Competence" on the Enlisted Performance Evaluation.

c.  Discharge  or  release  to  inactive  duty. 

d. Termination of work experience in the apprenticeable
trade for a period of more than one year.

e. Failure to report to a Navy Campus education specialist
for twice-a-year progress interviews unless the apprentice has requested
suspension of registration.

2. Cancellation of registration is accomplished through
submission of an appropriately checked apprentice progress/status report
(Exhibit II-C) signed by one of the following personnel:

a.  The  commanding  officer  or  officer  in  charge  of  the 
apprentice. 

b. A Navy Campus education specialist.

c. The Chief of Naval Education and Training.

d. The apprentice.

3. Cancellation of registration is tantamount to removal from
the apprenticeship program. Once a registration is canceled, an
individual can reenter the apprenticeship program only by reapplying
for registration. The Chief of Naval Education and Training will
adjudicate all cases of application for reregistration and will determine
whether reregistration of an active duty member will be permitted.

4.  Cancellation  of  registration  should  not  be  confused  with 
suspension  of  registration. 


205. SUSPENSION OF REGISTRATION

1. Suspension of registration is temporary. Suspension retains
the apprentice in a temporary inactive status for no more than one year
but still enables the apprentice to accumulate hours of work experience
for entry and verification on the work experience hourly record.

2. Suspension is accomplished by submission of an appropriately
checked apprentice progress/status report (Exhibit II-C) to the Chief
of Naval Education and Training signed by one of the following personnel:

a. The commanding officer or officer in charge of the
apprentice. 

b.  A  Navy  Campus  education  specialist. 

c.  The  Chief  of  Naval  Education  and  Training. 

d.  The  apprentice. 

3. Suspension will be granted by the Chief of Naval Education
and Training and the Department of Labor for any one of the following
reasons:

a. If the apprentice is unable to complete, for reasons
beyond control, work experience in the apprenticeable trade for a
period of one year or less; for example, hospitalization, orders to light
duty, or assignment to duties not related to the trade in which registered.

b. If the apprentice is unable to report for a required
apprentice  progress  interview  because  of  operational  requirements  or 
because  a  Navy  Campus  education  specialist  is  not  available. 

4. A suspension will be lifted if the apprentice resumes work
experience in the apprenticeable trade within one year after date of
suspension and reports to a Navy Campus education specialist. The
Navy Campus education specialist will submit an appropriately checked
and dated apprentice progress/status report and will include hours of
work experience completed during the suspension, if any.

5. Suspension does not require reregistration. The Chief of
Naval Education and Training will examine all cases of repetitious
suspensions and will determine whether cancellation is justified.

6. An apprentice is urged to earn work experience hours, however
small, during a suspension.


206. RESPONSIBILITIES FOR THE APPRENTICESHIP PROGRAM

1. The Chief of Naval Education and Training

a. Function as the single point of contact with the Department
of Labor regarding all policy and management aspects of the Navy's
National Apprenticeship Program for active duty personnel.

b. Provide policy guidance for the operation and management
of the National Apprenticeship Program for active duty personnel.

c. Identify the trades which are to be introduced as
apprenticeable trades within the active duty Navy.

d.  After  consultation  with  the  Department  of  Labor,  assign 
responsibility for promulgation of "work processes schedules" for
designated apprenticeable trades and ensure the introduction of these
schedules into appropriate records.

e. Evaluate the overall effectiveness of the Navy's National
Apprenticeship Program for active duty personnel and ensure that
acceptable levels of proficiency for apprenticeable trades are being met.

f. Screen and forward apprenticeship registration, suspension,
cancellation, and completion actions and individual apprenticeship progress
reports to the Department of Labor as required.

2. The Chief of Naval Education and Training Support

a. Arrange for the printing, stocking, and distribution of
the forms prescribed for the National Apprenticeship Program for active
duty personnel.

b. Ensure the indoctrination and effectiveness of Navy Campus
education  specialists  regarding  the  procedures,  controls,  and  action 
required  to  provide  efficient  field  management  and  maximum  assistance 
to  commanding  officers  regarding  the  National  Apprenticeship  Program 
for  active  duty  personnel. 

3. Commanding Officers, Officers in Charge, or Directors of
Class "A," "C," or "J" Schools

a. Register volunteer members as apprentices if they have
completed the required related instruction. The following will be
regarded as required related instruction:

(1) Successful completion of the course appropriate to the
trade listed in paragraph 208; or

(2) Satisfactorily documented completion of the
required hours of related instruction at an Army, Air Force, Marine
Corps, or civilian school. See paragraph 203.2 for the hours of
required related instruction earned at a non-Navy school.

b.  Ensure  that  eligible  registrants  are  credited  only  with 
documented  hours  of  previous  work  experience  in  accordance  with 
paragraphs  203.4  through  203.6. 

c.  Issue  the  appropriate  work  experience  log  to  volunteer 
members  at  the  time  of  their  registration. 

d. Ensure that registrants are counseled as to the conditions
and requirements of their apprenticeship. If desired, request
the expertise of an accessible Navy Campus education specialist for
this counseling.

4. Commanding Officers of Registered Apprentices

a.  Cancel  the  registration  of  personnel  for  any  of  the 
reasons  listed  in  paragraph  204.1. 

b.  Suspend  the  registration  of  personnel  for  any  of  the 
reasons  cited  in  paragraph  205.3. 

c. Upon completion of all required hours of work experience,
urge an apprentice to submit a request for issuance of a Certificate
of Apprenticeship Completion.

d. Ensure that legitimate "hours completed" entries are
made on the work experience hourly record in the work experience logs
of apprentices and that these entries are verified weekly by leading
petty officers or work center supervisors.

e. Urge an apprentice to report twice a year, with work
experience log, to the most accessible Navy Campus education specialist
for an apprentice progress interview.

f. When desired, request the services of an accessible
Navy Campus education specialist for the actions outlined in 4a, b, and
c above.


5. Navy Campus Education Specialists

a. Provide maximum assistance, advice, and guidance to
commanding officers, officers in charge, and directors of Class "A,"
"C," and "J" schools for the registration of eligible personnel and for
the counseling of registrants as to the conditions and requirements of
apprenticeships.

b. Provide maximum assistance, advice, and guidance to
commanding officers of registered apprentices for the cancellation or
suspension  of  registrations. 

c.  As  permitted  or  requested  by  commanding  officers,  assist 
individual  apprentices  in  requesting  a  Certificate  of  Apprenticeship 
Completion after all required hours of work experience have been completed.
In preparing the final apprentice progress/status report,
the Navy Campus education specialist will regard previous work experience
as distributed proportionately among the skill areas of the applicable
work processes schedule.

d. Be available for twice yearly apprentice progress interviews
and reviews of work experience logs. The Navy Campus education
specialist will inspect the work experience log of the apprentice and
complete the appropriate blocks of the apprentice progress/status report.
A professional advisement session will also be conducted, if appropriate.

e. Submit two copies of the apprentice progress/status
report to the Chief of Naval Education and Training, together with one
reproduced copy of any work experience hourly record which has been
verified since the last apprentice progress interview. One copy of the
apprentice progress/status report will be inserted in the work experience
log of the apprentice.

f. Maintain modest stocks of forms required to administer
subject program. Provide copies of required forms to apprentices and
potential apprentices on an individual basis as requested.

6. Individual Apprentices

a. Request registration after successful completion of the
applicable Class "A," "C," or "J" school.

b. At the time of registration, request the appropriate
work experience log and thereafter assume responsibility for its safekeeping.

c. Request credit, if eligible, for previous work experience
validated by page 4 service record entries or letters from previous
employers. Assume responsibility for procurement of this documentation.

d. Enter completed hours of work experience in the work
experience hourly record of the work experience log and have entries
verified by the leading petty officer or work center supervisor.

e. Report twice yearly, with work experience log in hand,
to the most accessible Navy Campus education specialist for an
apprentice  progress  interview  and  submission  of  an  apprentice  progress/ 
status  report.  Whenever  possible,  these  interviews  should  be  scheduled 
at  least  five  months  apart. 

f. If operational requirements, hospitalization, assignment
to duties not related to the trade in which registered, or inaccessibility
of a Navy Campus education specialist prevent twice yearly
progress interviews, request temporary suspension of registration in
accordance with paragraph 205.

g. Upon completion of all hours of work experience required
by the apprenticeship, report for a final apprentice progress/status
interview. The final report will require verification of total hours
of work experience in each skill area by commanding officer or accessible
Navy Campus education specialist and submission of one reproduced copy
of any work experience hourly record which has been verified since the
last apprentice progress interview.


207. AVAILABILITY OF FORMS AND REQUIRED REPORTS

1. Sample copies of work experience logs for apprenticeable
trades will be provided to Navy Campus education specialists by the
Chief of Naval Education and Training Support. Only commanding officers,
officers in charge, and directors of Class "A," "C," or "J" schools
providing related instruction for designated apprenticeable trades are
authorized to issue work experience logs to registered apprentices.
Available work experience logs are listed in Appendix A.

2. Modest supplies of the following forms will be provided to
Navy Campus education specialists and to Class "A," "C," and "J"
schools providing related instruction for designated apprenticeable
trades:

Apprentice Registration Application, CNET Form 1560/1
Apprentice Progress/Status Report, CNET Form 1560/2
Work Experience Hourly Record, CNET Form 1560/3.

3. Replacement supplies of the forms listed in the above
paragraphs may be obtained by letter request to the Chief of Naval
Education and Training Support, Pensacola, FL 32509. The forms will
not be provided directly to individuals or to commands other than those
cited above. Individuals having a need for any of the forms will
obtain them from the most accessible Navy Campus education specialist.

4. Reports required for the National Apprenticeship Program have
been approved by the Chief of Naval Operations. Report symbols apply
as follows:

Report Symbol        Title                                 Form

CNET Report 1560-4   Apprentice Registration Application   CNET Form 1560/1
CNET Report 1560-5   Apprentice Progress/Status Report     CNET Form 1560/2


208. AUTHORIZED TRADES. The Chief of Naval Education and Training and
the Department of Labor have authorized the following apprenticeable
trades within the active duty Navy:

1. Office Machines Servicer

a.  Term  of  apprenticeship:  6000  hours  of  specified  and 
recorded  work  experience. 

b. Dictionary of Occupational Titles (DOT) Code: 633.281.034.

c. All required related instruction satisfied by completion
of any three of the listed packaged courses at the Service Schools
Command, Naval Training Center, Great Lakes, IL 60088, provided that
the applicant has previously completed the Instrumentman, Class "A"
course. Personnel who are not graduates of the Instrumentman, Class "A"
course are required to complete any five of the six courses in order to
qualify for registration.

A-670-0028 IBM Selectric Typewriter Repair
A-670-0029 IBM C7I Electric Typewriter Repair
A-670-0031 Remington Adder, Model 4, Repair
A-670-0032 Friden Calculator, Model 3W, Repair
A-670-0034 Electronic Calculator Repair
A-670-0045 Bell and Howell Ditto OcRfcm&tic Copier/Duplicator,
Model 4 4 IN, Repair.

d. Source Rating: Instrumentman (IM).

e. NEC's which can be translated into credit for previous
work experience: None.

2. Watch-Clock Repairer

a. Term of apprenticeship: 6000 hours of specified and
recorded work experience.

b. DOT Code: 715.281.034.

c. All required related instruction satisfied by completion
of Watch Repair Course, Instrumentman Class "C1" (A-670-0011), Service
Schools Command, Naval Training Center, Great Lakes, IL 60088.

d. Source Rating: Instrumentman (IM).

e. NEC's which can be translated into credit for previous
work experience on a full year basis up to a maximum of 3000 hours:
IM-1812.


3. Commercial Photographer

a. Term of apprenticeship: 6000 hours of specified and
recorded work experience.

b. DOT Code: 143.062.034.

c. All required related instruction satisfied by completion
of Photographer's Mate School, Class "A1" (Level 1) (C-400-2011) or
Photographer's Mate School, Class "C7" (Level 2) (C-400-2012), Naval
Technical Training Center, Corry Station Detachment (Photo School),
Naval Air Station, Pensacola, FL 32508.

d. Source Rating: Photographer's Mate (PH).

e. NEC's which can be translated into credit for previous
work experience on a full year basis up to a maximum of 2000 hours:
PH-9148.

a. Term of apprenticeship: 4000 hours of specified and
recorded work experience.

b. DOT Code: 714.281.014.

c. All required related instruction satisfied by completion
of Photographic Equipment Repair Course, Class "C1" (Level 3)
(C-670-2012), Naval Technical Training Unit, Corry Station Detachment
(Photo School), Naval Air Station, Pensacola, FL 32508.

d.  Source  Rating:  Photographer's  Mate  (PH). 

e.  NEC's  which  can  be  translated  into  credit  for  previous 
work  experience  on  a  full  year  basis  up  to  a  maximum  of  2000  hours: 
PH-8192. 


5. Cook (Hotel and Restaurant)

a. Term of apprenticeship: 6000 hours of specified and
recorded work experience.

b. DOT Code: 313.381.022.

c. All required related instruction is satisfied by completion
of one of the following courses:

Mess Management Specialist Course, Food Production,
Class "C1" (A-800-0018), Service Schools Command, Naval Training Center,
San Diego, CA 92133. (Formerly Commissaryman/Steward, Class "C1"
(A-800-0018), Food Production Course).

Mess Management Specialist Course, Management Principles,
Class "C1" (A-800-0015), Service Schools Command, Naval Training Center,
San Diego, CA 92133 or Fleet Training Center, Norfolk, VA 23511.

d. Source Rating: Mess Management Specialist (MS).

e. NEC's which can be translated into credit for previous
work experience on a full year basis up to a maximum of 3000 hours:
MS-3503; MS-3527; MS-3528; MS-3529; MS-3531; MS-3532; MS-3533.


EXHIBIT II-A

APPRENTICE REGISTRATION APPLICATION (CNET Form 1560/1)

PRIVACY ACT NOTIFICATION. The information requested on this form is
used to evaluate the applicant's qualifications for the Department of
Labor apprenticeship program for active duty Navy personnel. The social
security number is used for individual identification. The applicant is
not required to provide this information; however, failure to do so may
result in not being registered for an apprenticeable trade.

The form records the total hours required for the term of apprenticeship
and the hours of credit given for previous work experience.


29  July  1976 


Fran:  B42  .John  K.  MELCWIG,  USN,  399-64-6483 

USS  SRE2JAND0AH  (AD-26)  R-5  Division,  FfO  tow  York  09501 
7b:  Director,  CHAH  Schools,  Service  Schools  Oannand,  Great  Lakes, 

Illinois  60038 
Via:  Commanding  Officer 

Subj:  Registration  in  National  Apprenticeship  Program;  request  for 

Enel:  (1)  Apprentice  Registration  Application  (triplicate) 

(2)  Copy  of  service  record,  page  4 

1.  Enclosure  (1)  is  forwarded  for  caipledLon  of  registration.  I  com¬ 
pleted  the  Watch  Repair  Course,  Instrumentation  Class  "Cl*  (A-670-0011) 
on  5  June  1975  and  have  therefore  satisfied  the  related  instruction 
requirements  for  the  Watch -Clock  Repair er  apprenticeship. 

2.  I  request  credit  for  man  hours  of  previous  work  experience  rased 
upon  my  assignment  to  the  Navy  Enlisted  Classification  IM-1812  since 
7  June  1975.  Enclosure  (2)  is  forwarded  in  verification  of  this  NEC. 

3.  It  is  further  requested  that  the  Wbrk  Experience  log  for  the 
Hitch-Clock  Repairer  apprenticeship  be  forwardtd  to  me  so  that  I  may 
begin  to  record  my  work  experience  on  a  regular  basis. 


JOCIN  II  MEIXWIG 


30  July  1976 

FIRST ENDORSEMENT on IM2 John H. MELCWIG ltr of 29 Jul 1976

From:  Commanding Officer, USS SHENANDOAH (AD-26)
To:    Commanding Officer, Service Schools Command (Director, OryiM
       Schools), Great Lakes, Illinois 60038

Subj:  Registration in National Apprenticeship Program; request for

1.  Forwarded, recommending approval.


L.M.  KOUSEUK 
By direction


EXHIBIT II-B


DANTES Discussion Panel


THE  COMMUNITY  COLLEGE  OF  THE  AIR  FORCE:  TRADITIONAL 
TECHNICAL  EDUCATION  IN  THE  NON- TRADITIONAL  MODE 

Maj William A. Wojciechowski


OVERVIEW: Education program development within the Community
College of the Air Force (CCAF) is consistent with its
philosophy and in line with trends in both Air Force and
civilian occupational education. Simply stated, CCAF intends
to focus the educational effort of airmen toward obtaining a
high quality, occupationally oriented career education which
will enable them to serve as a master of and a supervisor in
their specialty, either in the Air Force or in civilian life.

The U.S. Office of Education has estimated that 80% of
the work force will require occupational skills in the 1980's
as contrasted to baccalaureate skills. Concurrent with this
changing emphasis in needed educational skills, we have seen
the Air Force enlisted force decline from more than 800,000
personnel to slightly more than 500,000 in less than 5 years.
With this reduction has come an expansion of technology
requiring our personnel to be familiar with more complex
weapon systems. Moreover, we see a growing need for more
sophisticated supervisory skills. Past incidents of notable
unrest indicate that knowledge of technology alone is
inadequate; skill in human relations acquired through an
understanding of the social sciences and the humanities is
necessary. Therefore, the Community College of the Air Force has
systematically plotted the career education of Air Force
enlisted personnel to meet the demands of changing technologies
and to develop the social awareness necessary for the
effective management of people.

This paper depicts a model for the continuous development
of CCAF programs and is a scenario describing the CCAF
model in action. It explains how occupational requirements
are translated into needs and interacted with resources to
form the basis from which CCAF programs are developed.
Discussion follows on the process of quality control as it is
served by the CCAF Policy Council and the consultant advisory
panel for each program. After review and approval of the
program, articulation between CCAF, the Education Services
Officer and the airman is defined. Finally, follow-up is
discussed in terms of a feedback loop which provides
continuity to the entire process.

OCCUPATIONAL REQUIREMENTS: Requirements for specific jobs
in the Air Force are determined from a review of programming
documents, such as the Personnel Manpower Change Program
series, which indicate numbers of personnel required in
specific occupational specialties. The documents also reflect
how many airmen, and with what training, leave the Air Force
annually. The quality of airmen needed is indicated in Air
Force Manual 39-1, Airman Classification Manual, and through
means such as job inventory programs conducted by the
Occupational Measurement Center of the Air Training Command, the
Air Force Human Resources Laboratory, and feedback from using
agencies.

Since it has been estimated by the Aerospace Education
Foundation and the Air Force Human Resources Laboratory that
from 60% to 90% of Air Force occupations have civilian
counterparts, we believe a review of civilian occupational
requirements is necessary as part of our program development
process. The Department of Labor's Occupational Outlook
Handbook and comparable state guides are reviewed to
determine projected requirements for specific occupations. Long
range projections are provided by the Department of Labor's
Industry-Occupational Matrices and the President's manpower
report.

An assessment of the quality of education required is
then made through a review of guidelines such as those found
in the USOE Technical Education Program series and
Occupational Criteria and Preparatory Curriculum Plans in Technical
Education Programs. Curriculum guides such as those for
aviation and electro-mechanical careers published by the
American Association of Community and Junior Colleges are
also beneficial. The ASEE Engineering Technology Education
Study has been useful as a means of defining parameters of
education for technicians. Guidelines prepared by other
agencies such as the International Association of
Firefighters and the Texas Commission on Law Enforcement Education
have been studied and incorporated wherever possible to
provide the necessary prerequisites for future licensing and
certification of CCAF graduates.

RESOURCES: Prior to the existence of the CCAF's educational
programs for enlisted persons, little emphasis was given to
tying together service provided instruction and the related
technical and general education available to the airman
through a variety of sources. The Air Force provides
technical training via some 2,500 to 4,000 technical training
courses to approximately 300,000 students per year.
Approximately 80,000 students attend resident technical courses
upon entering the Air Force. In addition, airmen receive
management instruction through a series of NCO Academies and
specialized instruction from agencies such as the School of
Aerospace Medicine. Also, there is an extensive system for
providing work experience combined with on-job instruction
through a dual-channel OJT program of work and study.
Completion of this period of apprenticeship is documented, attesting
to an individual's ability level to perform skills necessary
in a particular specialty. This supervised training is
identified as CCAF's Internship program, for which credit is
awarded.

Most of the aforementioned instruction is specifically
designed to prepare airmen either as technicians or as
managers/leaders. CCAF integrates this instruction with
courses in related general education available from almost
400 civilian institutions which are associated with our 172
Base Education Services Centers. Included in CCAF's programs
as an option is limited credit by examination offered through
DANTES (formerly the Armed Forces Institute (USAFI)).

CCAF CONTROL: If these myriad forms of instruction are to
be focused toward career relevant education for airmen,
program control must be exercised by a central agency. CCAF
serves this function for the Air Force. The Program
Development Division of CCAF is charged with the responsibility for
analyzing service instruction to determine which parts of Air
Force instruction:

o  Are at a civilian post-secondary level. (This
evaluation is made by program administrators, analysts,
and department heads who have an average of 18 years
experience in their occupational specialties as well
as undergraduate and graduate degrees.)

o  Have civilian applicability and/or are occupationally
related, including that which is exclusively Air Force
oriented. Subject matter which meets these basic
criteria is evaluated on the basis of an average of
30 contact hours of instruction being equivalent to
one semester hour. CCAF credit is applicable toward
a Community College of the Air Force Associate in
Applied Science (AAS) Degree.

Using service instruction as a core and guidelines
established by USOE and other standard setting agencies for 2-year
occupationally oriented associate level programs, a basic
career pattern for the AAS was established. Within a total
minimum length of 64 semester hours, sub-minima of 24 semester
hours of technical education related to an airman's Air Force
occupation, 25 semester hours of general education designed
for personal enrichment and to enhance supervisory skills,
and 6 semester hours of management instruction make up the
basic program pattern. Many technical electives and general
education courses in the program are obtained from accredited
institutions by airmen during off-duty time, either
with the Air Force providing 75 percent tuition assistance
or with VA assistance. Currently, only 24 hours of credit
by examination may be used toward this requirement.
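
The credit rule and program pattern described above can be sketched in a few lines of Python. This is an illustration only, not CCAF's actual procedure. The 30-to-1 contact-hour ratio, the minima (64 total, 24 technical, 25 general education, 6 management), and the 24-hour ceiling on credit by examination come from the text. The names `semester_hours`, `CreditRecord`, and `unmet_requirements` are hypothetical, as is the treatment of the 9 hours not covered by the sub-minima as unrestricted electives and the choice to leave fractional hours unrounded.

```python
from dataclasses import dataclass

# Figures taken from the text.
CONTACT_HOURS_PER_SEMESTER_HOUR = 30
MIN_TOTAL = 64
MIN_TECHNICAL = 24
MIN_GENERAL_EDUCATION = 25
MIN_MANAGEMENT = 6
MAX_EXAM_CREDIT = 24


def semester_hours(contact_hours: float) -> float:
    """Convert contact hours of instruction to semester hours (unrounded)."""
    if contact_hours < 0:
        raise ValueError("contact hours cannot be negative")
    return contact_hours / CONTACT_HOURS_PER_SEMESTER_HOUR


@dataclass
class CreditRecord:
    """Semester hours accumulated by one airman (hypothetical structure)."""
    technical: float
    general_education: float
    management: float
    electives: float = 0.0       # assumption: hours beyond the sub-minima
    by_examination: float = 0.0  # portion of the hours above earned by exam


def unmet_requirements(record: CreditRecord) -> list[str]:
    """Return the names of the requirements the record does not yet meet."""
    problems = []
    if record.technical < MIN_TECHNICAL:
        problems.append("technical education")
    if record.general_education < MIN_GENERAL_EDUCATION:
        problems.append("general education")
    if record.management < MIN_MANAGEMENT:
        problems.append("management instruction")
    total = (record.technical + record.general_education
             + record.management + record.electives)
    # Assumption: exam credit above the ceiling does not count toward the total.
    excess_exam = max(0.0, record.by_examination - MAX_EXAM_CREDIT)
    if total - excess_exam < MIN_TOTAL:
        problems.append("total semester hours")
    return problems


# Hypothetical figures, for illustration only.
print(semester_hours(360))                                       # 12.0
print(unmet_requirements(CreditRecord(24, 25, 6, electives=9)))  # []
print(unmet_requirements(CreditRecord(24, 25, 6)))               # ['total semester hours']
```

Note that the three sub-minima sum to 55 semester hours, so a record that only just meets each of them still falls 9 hours short of the 64-hour total.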

As is clear from the foregoing, we are interested in
focusing students' efforts on education related to their
occupational specialties within the Air Force in order to
maximize job performance and to enhance opportunities for
post service employment. Therefore, after outlining 64 hour
minimum program patterns, the Program Development Division
has reviewed all Air Force occupational specialties and
clustered them into 85 programs of study in 5 major career
areas. For example, in Management and Logistics there are
programs in general business, business management, and computer
science. In the electro-mechanical career areas, some 17
programs were developed. Each of these was constructed using
the best available curriculum guidance. For example, U.S.
Office of Education guidance for technical education
indicates that programs should provide students with a facility
in mathematics, physical science principles relative to
technical skills, and ability in communications skills along with
the knowledge of a particular occupational specialty. To
this core we have added a facility in social sciences,
humanities, and management to ensure that airmen attaining
supervisory positions develop a social awareness necessary
to coping with the increasingly complex demands on personnel
in management positions.

After the 85 programs were initially developed, they
were reviewed by the CCAF Policy Council (annual review),
chaired by the Dean and consisting of members of the College
staff to include experts in the areas being reviewed. The
programs were reviewed against criteria such as the
following:

o  Does the program provide for Air Force occupational
needs?

o  Does the program have a civilian occupational
orientation?

o  Does the program meet the established minimum of 64
semester hours?

o  Does the program satisfy minimum criteria for
accreditation such as those expressed in the Southern
Association's Commission on Colleges guidelines?

After  modifications  suggested  by  the  Policy  Council  to 
the  programs  are  made,  advisory  panels  of  consultants  are 
brought  into  action. 

ADVISORY PANELS: To provide assurance that programs
developed and controlled by CCAF are consistent with Air Force
needs and, where possible, civilian requirements, we have
identified advisory panels of consultants. They are
representatives from the technical schools which offer instruction
reflected in the programs; representatives from the Air
Training Command Technical Training Directorate or Surgeon
General; and representatives from business, education or
industry. Wherever possible, representatives from
appropriate professional organizations are asked to review CCAF
programs and comment on them in terms of adequacy in
preparing individuals to fulfill professional duties at the
technician level. If licensing or certifying agencies exist
for an occupational specialty, members of those agencies are
also asked to review appropriate programs to determine their
adequacy in preparing students for certification and
licensing. Finally, in those areas where employment entry is
dominated by unions or specific industries, members of the
unions or industry may also be asked to comment. As an
example, CCAF is interested in how its programs meet apprentice
requirements. Although these advisory panels cannot be
formalized because of federal regulations, informal contacts
are continuously occurring. Consultants representing the
groups noted above are employed and formally review CCAF
programs. Their reports are available for examination.

PROGRAM DISTRIBUTION/GUIDANCE: Subsequent to annual review
by advisory panels, programs are modified as necessary by
the staff of the Program Development Division and formally
approved by the Policy Council. Thereafter, the programs
will be incorporated into the CCAF catalog for distribution
to Air Force schools and bases, recruiters, high schools,
Education Services Officers, college registrars, and
requesting employers. The catalog clearly indicates who can
enter a CCAF program. It also provides specific guidance to
Education Services Officers concerning CCAF programs and what
courses airmen should be advised to take to progress through
a degree program. The catalog clearly specifies how an
airman may request a transcript of his technical education
completed while in the Air Force and how he may have an official
transcript forwarded to employers or colleges. Finally, the
catalog details the procedure by which an airman may
accumulate the necessary semester hours for an AAS and submit
documentation to CCAF in support of his request for the degree.

ASSOCIATE IN APPLIED SCIENCE DEGREE APPROVAL: Once the
airman's documentation is received at CCAF (after review and
consolidation by Education Services Officers), it is
reviewed by program administrators to ensure that courses
completed are from accredited institutions and are
consistent with program objectives and that the airman has a
coherent body of knowledge reflecting a comprehensive grasp of
his occupational specialty. Airmen who fulfill AAS
requirements will be recommended to the Policy Council for approval
and award of the Associate in Applied Science degree.

FOLLOW-UP: To assure that the degree programs continue to
meet the needs of the Air Force and, where possible,
civilian employers, follow-up studies are conducted by the
Institutional Research Branch of the CCAF Registrar. The
studies determine how CCAF programs contribute to the
improvement of NCO quality in terms of vocational skills
and supervisory competence and how useful the programs are
to employers and to other agencies such as colleges and
universities. Feedback from these studies, as well as
informal feedback from registrars, surveys, and other sources,
enables us to modify and improve the programs as necessary.

SUMMARY: The systematic model which illustrates CCAF
program development is educationally sound. Occupational
requirements are illustrative of needs. Educational
resources are available to fill these needs and CCAF programs
result. Quality control in terms of substance and amount
occurs as a result of subsequent reviews from the CCAF Policy
Council and advisory panels. Necessary modifications result
and, then, final approval. The resulting career programs
are published in the CCAF catalog. Interaction between CCAF,
the airman and his Education Services Officer occurs. This
results in continuous vocational guidance with the objective
being receipt of the AAS. Subsequent follow-up of CCAF
graduates is made to determine how well CCAF programs have
prepared the individual for his role as a technician and
supervisor. The follow-up serves as a feedback loop making
the CCAF program development process a continuous one.


DANTES  Discussion  Panel 


ARMY  SKILL  DOCUMENTATION  PROGRAMS 
Lt  Col  Hal  W.  Downey 

1.  Soldiers receive, in varying degrees, training and experience in
skills which have counterparts in the civilian community. As a result,
many soldiers leaving the Army are proficient in skills needed by
civilian industry and could be assimilated into the economy without
extensive training. Experience has shown that industry has not, as a
general rule, accepted ex-servicemembers as skilled workers. Because
this attitude is costly to industry we must assume it is based upon a
lack of awareness of the value of military training and experience
rather than a reluctance to employ veterans at an advanced starting
level. The Army, in an effort to create an awareness of the value of
military training and experience in civilian industry, in the early 1970s
began to develop programs to obtain recognition from selected industries
for the skills of soldiers in certain Engineer, Transportation and
Culinary specialities. Pilot programs in apprenticeship, industry
recognition of non-apprenticeable skills, and technical certification were
initiated.

2.  The initial programs were developed by Training and Doctrine Command
(TRADOC) Service Schools, as the technical expertise in the skills the
programs represented was in the training committees of these schools.
There was no central Army control and there were no uniform procedures
for program development. These latter facts caused concern at
Headquarters, Department of the Army, and in late 1974 The Adjutant General
was given the responsibility for coordinating program development and
for prescribing uniform policies for program development and management.
At this time the three pilot programs were well underway.

a.  The National Apprenticeship Standards for the Army were developed,
and in July 1975 were registered officially with the Bureau of
Apprenticeship and Training (BAT), US Department of Labor. The Engineer School
Apprenticeship Program, developed jointly by the Department of Labor,
the US Army Engineer School and three civilian agencies (the International
Union of Operating Engineers, the Associated General Contractors of
America, and the National Constructors Association), was registered with
BAT in August 1975 and became the Army's first apprenticeship program.
This program, consisting of four occupational skills (Plant Equipment
Operator, Grading and Paving Equipment Operator, Universal Equipment
Operator and Heavy Duty Repairer (Construction Equipment)), was
implemented Army-wide in February 1976. From the initial pilot program in 1973
to implementation of an Army program in 1976, we learned a great deal
about apprenticeship programs, much of it by trial and error. The
prolonged developmental period for these first programs shortened the
overall development of the Army apprenticeship program.


b.  The Transportation School developed its pilot program to gain
recognition for enlisted personnel who desire a career in the fields of
motor transport operations, marine craft operations and maintenance,
aviation maintenance, or terminal operations and whose military
experience and training have been in these fields. This program was
developed in coordination with some 40 national and regional industries
which employ people in the aforementioned career fields. Industry
representatives, after reviewing training programs of the Transportation
School, indorsed the Transportation Corps Industry Accreditation Program
(IAP). The IAP is essentially a referral program thru which soldiers
about to leave the Army prepare a brief, formatted resume of their Army
experience, list areas of desired employment, and are referred to
employers who could provide them jobs. This program will serve as the
model for the manner of developing Industry Recognition Programs in
coordination with interested national/regional industries; however, we
do not anticipate development of any further referral programs.

c.  The Quartermaster School developed a pilot program in the area of
professional/technical certification for food service personnel. They
contacted national/regional agencies which were leaders in the culinary field
to invite them to participate in development of an Army program. Only
the American Culinary Federation (ACF) had a certification program and
desired to work with the Army. The resultant program was one which
outlined procedures for soldiers to follow to gain ACF certification at
the various levels of culinary expertise and provided for participants
to join ACF chapters. After much debate it was finally determined that
the Army would not sponsor any private association programs and, since
ACF was a private association, the pilot was not to become an Army
program. Instead, the Quartermaster School was to develop an Industry
Recognition Program for food service personnel as a follow-up to their
cooks' apprenticeship program or as an alternative form of documentation
for cooks.

3.  The Army set as its goal the development of programs to document
skills attained by soldiers, in a form which would have meaning to
civilian industry. There would be no alteration of Army training or of
duty assignments simply to provide skills useful to soldiers in seeking
post-service employment; there would be no programs developed to train
soldiers in civilian skills; there would be no programs designed
primarily to gain certification from private associations.

a.  The Army determined to target its initial programs on enlisted
personnel, as they comprise the bulk of the soldiers leaving the service.

b.  Because apprenticeship programs offer a well structured and
meaningful form of documentation of progression in skills, we elected to
concentrate first on Army skills which had counterpart civilian
apprenticeable skills and then to move on to other directly relatable skills
or to programs beyond apprenticeship, for apprenticeable skills.


c.  We determined to develop some experimental programs to attempt
to relate some of the general skills of personnel in specialities which
have no direct civilian counterparts (e.g. combat arms) to those
experiences employers would consider attributes in prospective employees
(i.e., leadership, administration, stocking, etc.).

d.  We will also, either using the catalog developed by the Defense
Activity for Non-Traditional Education Support (DANTES) or thru our own
research, identify skills requiring licensing/certification as a
prerequisite for employment and indicate the regulating agencies.

4.  Army Apprenticeship Programs for Military Personnel.

a.  These are programs paralleling those in the civilian community,
developed for Army skills which relate to civilian apprenticeable skills,
registered with the US Department of Labor (DOL), thru which participants
can achieve DOL certification as journeyman.

b.  The TRADOC Service Schools who have training proponency for an
enlisted speciality (MOS), meaning they are responsible for the
development of training programs to qualify personnel in the skills of
the MOS, design apprenticeship programs with the assistance of the
Bureau of Apprenticeship and Training (BAT), DOL. They send their
programs thru command channels to BAT requesting the programs be
registered under the National Standards of the Army and the
apprenticeship standards of their school. After a program is registered
the school prepares a draft of a DA Pamphlet which will implement the
program Army-wide. Once a program is implemented, the school acts as
technical advisor to Education Services Officers and ensures programs
are kept current.

c.  Army Apprenticeship Programs are a part of the Army Continuing
Education System. They are a means of professional development for a
soldier and are operated thru Army Education Centers. Soldiers register
with their Education Services Officer who in turn sends registration data
to the Army Adjutant General Center (TAGCEN) where data is stored and
given each month to BAT.

d.  Soldiers have individual logbooks which contain cumulative
records of work experience, validated by their supervisors. Each
apprenticeship program has a schedule of work processes in which an apprentice
must log a specific number of hours of satisfactory work in order to
reach journeyman status. Programs vary, depending upon the complexity of
the skill, from a total of 2000 hours of experience to 8000 hours, and for
each 2000 hours there is a corresponding requirement that the apprentice
obtain 144 hours of related instruction (in many cases Advanced Individual
Training will satisfy the full program requirement for instruction).
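
The related-instruction requirement in paragraph d can be worked out for any program length. The sketch below is illustrative only: the 2000-to-8000 hour range and the 144 hours per 2000 come from the text, while the function name and the pro-rating of lengths that are not a multiple of 2000 hours are assumptions.

```python
# Figures taken from the text.
WORK_HOURS_PER_BLOCK = 2000     # hours of work experience per block
RELATED_HOURS_PER_BLOCK = 144   # related instruction required per block
MIN_PROGRAM_HOURS = 2000
MAX_PROGRAM_HOURS = 8000


def related_instruction_hours(program_hours: int) -> float:
    """Related instruction required for a program of the given length.

    Assumption: lengths that are not a multiple of 2000 hours are
    pro-rated; the text states the rule only per 2000-hour block.
    """
    if not MIN_PROGRAM_HOURS <= program_hours <= MAX_PROGRAM_HOURS:
        raise ValueError("program length outside the 2000-8000 hour range")
    return program_hours / WORK_HOURS_PER_BLOCK * RELATED_HOURS_PER_BLOCK


for hours in (2000, 6000, 8000):
    print(hours, related_instruction_hours(hours))
# 2000 144.0
# 6000 432.0
# 8000 576.0
```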

e.  If a soldier logs all required hours of work experience and
attends the requisite hours of related instruction he/she may apply to
the local Education Services Officer (ESO) for a Certificate of Completion of
Apprenticeship. Those certificates will be issued by DOL. The ESO
simply verifies the logbook entries of the individual and, once the
entitlement is established, requests that TAGCEN obtain a certificate
from DOL. TAGCEN sends the signed certificates to the ESO for
appropriate presentation.
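
The logbook verification described in paragraph e amounts to comparing logged hours against the schedule of work processes. A minimal sketch follows, under stated assumptions: the function name, the dictionary layout, and the work process names and hour figures are all hypothetical, since the text does not list the work processes of any program.

```python
def hours_outstanding(schedule: dict, logbook: dict) -> dict:
    """Hours still owed in each work process.

    schedule maps each work process to its required hours; logbook maps
    work processes to validated hours logged so far. An empty result
    means every work process in the schedule has been satisfied.
    """
    outstanding = {}
    for process, required in schedule.items():
        logged = logbook.get(process, 0)
        if logged < required:
            outstanding[process] = required - logged
    return outstanding


# Hypothetical 6000-hour schedule and logbook, for illustration only.
schedule = {"engine repair": 3000, "hydraulics": 2000, "inspection": 1000}
logbook = {"engine repair": 3000, "hydraulics": 1500}

print(hours_outstanding(schedule, logbook))
# {'hydraulics': 500, 'inspection': 1000}
```

The same comparison would serve a soldier leaving with a program only partly completed, since the logged totals per work process are exactly what a certifying letter would need to report.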


f.  Soldiers who leave the Army after having completed only a
portion of the total required hours of experience and/or related
instruction present their logbooks to the ESO for verification. The ESO will
then issue a letter certifying the number of hours completed in each work
process and attesting to the fact the individual was participating in a
nationally registered apprenticeship program.


g.  The Army had 29 programs implemented as of 30 September 1977, 30
more registered awaiting publication of DA Pamphlets for implementation,
and 13 at BAT under review. This constitutes the initial group of
programs to be developed. Additional programs will be developed as new
areas are opened to apprenticeship (e.g. a law enforcement program was
recently registered with BAT, and copies of this program, obtained by
TAGCEN in Sep 77, were furnished to the Military Police School for
consideration). See Incl 1 for program listing.


h.  We believe that apprenticeship programs will offer soldiers an
appealing means of professional development with a clearly defined
personal goal. As soldiers pursue this goal they will become more
professional in their MOS and, because most all programs will take more
than one enlistment to complete, there will be an additional incentive
for reenlistment for a 2d term of service. As the value of these
programs can be documented (from surveys of soldiers leaving the service
and from evaluations by participants) they should present a valuable
recruiting tool. The Army benefits will accrue as soldiers strive
toward their individual goals.

5.  Industry Recognition Programs (IRP)


a.  These will be programs developed in conjunction with employing
industries, to establish a form of credential which will adequately
define for potential employers an individual's level of skill proficiency,
either in a non-apprenticeable skill or that achieved beyond journeyman
status in an apprenticeable skill.


b.  Current plans call for the development of IRP's to be completed
by September 1978. The Veterans Employment Service of DOL has agreed
to assist the Army in developing IRP's.

6.  In summary:

a.  We know soldiers possess valuable skills, attained thru
military training and experience, which in many cases relate to
occupational skills in civilian industry.

b.  Our programs are designed to document this skill proficiency in
a form which has practical significance to potential employers.

c.  We do not alter training or assignment to qualify an individual
in a civilian skill, nor do we run special training programs which
detract from the time a soldier spends in readiness training.

d.  We do not promise post-service employment nor are we in job
placement. We offer a voluntary means of professional development, as part
of the Army Continuing Education System.


IMPLEMENTED ARMY APPRENTICESHIP PROGRAMS


PROGRAM TITLE                                             LENGTH

ENGINEER SCHOOL
Plant Equipment Operator                                  6000 hours
Grading & Paving Equipment Operator                       6000 hours
Heavy Duty Repairer (Const Equip)                         6000 hours
Universal Equipment Operator (Const Equip)                6000 hours

QUARTERMASTER SCHOOL
Cook                                                      6000 hours
Laboratory Technician (Petroleum)                         6000 hours

TRANSPORTATION SCHOOL
Sheet Metal Worker (Aircraft)                             6000 hours
Electrical Mechanic (Acrft)                               6000 hours
Marine Heavy Duty Mechanic (Hvy Dty Mech - Diesel)        8000 hours
Marine Hull Repairer, Ironworker (Boatbuilder, Steel)     8000 hours
Airplane Mechanic                                         6000 hours

SIGNAL SCHOOL
Radio Communications Technician                           8000 hours
Aircraft Electrical Mechanic                              8000 hours
Central Office Telephone Installer & Repairer             8000 hours
Automatic Equipment Technician                            8000 hours
Radio/TV Repairer                                         8000 hours
Cable Splicer                                             8000 hours

ORDNANCE SCHOOL
Small Weapons Repairer                                    8000 hours
Artillery Repairer                                        8000 hours
Industrial Welder                                         6000 hours
Machinist                                                 8000 hours
Automobile Body Repairer & Painter                        8000 hours
Sewing Machine Repairer                                   6000 hours
Automobile Mechanic                                       8000 hours
Truck Mechanic                                            8000 hours
Heavy Duty Equipment Mechanic                             8000 hours

MISSILE/MUNITIONS SCHOOL
Electronics Technician (Radar)                            7000 hours
Electrical Instrument Repairer                            6000 hours
Hydraulic-Equipment Mechanic                              7000 hours


(Inclosure 1)


ATTITUDINAL CORRELATES OF REENLISTMENT INTENT AMONG WOMEN IN THE ARMY


Jack M. Hicks

U. S. Army Research Institute for the
Behavioral and Social Sciences

INTRODUCTION 

The investigation upon which this report was based grew out of a
need for the U. S. Army to better understand the contributions which
young women might potentially make to its manpower system. This need
derived in particular from the fact that women constitute 50% of the
general population, and thus greatly expand the Army's potential manpower
source. The overall objective of this effort was to generate a
better understanding of the host of variables pertaining to the adjustment
of enlisted women (EW) in the Army, including current attitudes,
morale, perceptions, motivations, and reenlistment intent.

The present report was generally responsive to current Army concerns
with respect to retention of first-term enlisted personnel. More
specifically, this analysis addressed the currently little studied
domain of reenlistment intent correlates among Army EW. The purpose of
this report was to identify some of the most promising correlates of
self-reported reenlistment intent among those attitudinal variables
included in the above discussed parent investigation. The long-range
objective of this research is to develop a profile of reliable correlates
of reenlistment intent, specifically applicable to EW. It is hoped
that ultimately, such an analysis might serve as a basis for an EW
reenlistment predictor battery.

METHOD

Based upon preliminary interviews and a pilot study, a 162-item
questionnaire was designed. Nine division-size Army installations with
a high percentage of EW were sampled for data collection, yielding a
total of 1718 EW respondents. To the extent possible, selection of EW
was governed by the following specifications: (1) length of service,
18-20 months (first-term); (2) at least [illegible] months from expiration of
term of service (ETS); (3) primarily E-3 and E-4; (4) eligible for
reenlistment; and (5) diversity of MOS.

The questionnaire was administered by civilian data collection
teams to collectives of [illegible] to [illegible] EW per session, in December 1975.
Confidentiality and anonymity were assured. The verbatim instructions
were read aloud, with the respondents following silently from their
questionnaire booklets. A one-hour period proved sufficient to complete
the questionnaire.


RESULTS 


Twenty-five percent of the EW indicated that they would definitely
or probably reenlist at ETS. An additional 26% indicated that they
might or might not reenlist, with the remaining 49% reporting that they
would definitely or probably not reenlist.

Enlistment Considerations. The interpretation of relationships between
initial enlistment motivations and reenlistment intent should take
into account that a minimum of 18 months had passed since enlistment,
and that ETS was still perhaps 18 months in the future. Such time lags
might be expected to spuriously deflate whatever true relationships
exist. Nonetheless, several statistically significant correlations
resulted. The statistical tests employed were chi square, to establish
significance/nonsignificance, and Cramer's V, to estimate strength of
relationship. Of the 21 most frequently reported enlistment motivations,
5 were found to be significantly associated with reenlistment intent.
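
The two statistics named above can be sketched in a few lines. This is a minimal illustration, not the computation actually run for the study: the function names `chi_square` and `cramers_v` and the example counts are assumptions made here for illustration only.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def cramers_v(table):
    """Cramer's V: chi-square rescaled by sample size and table shape to lie in 0-1."""
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (smaller_dim - 1)))


# Hypothetical counts, NOT survey data: rows are "gave this motive" / "did not",
# columns are WILL / WONT reenlist.
example = [[30, 10],
           [20, 40]]

print(round(chi_square(example), 2))  # 16.67
print(round(cramers_v(example), 2))   # 0.41
```

When only two intent groups are compared, the formula reduces to V = sqrt(chi square / N), so V expresses the strength of association independently of sample size, whereas chi square alone only supports the significance test.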

Of some interest was the fact that the most frequently reported enlistment
motives were not those which proved to be most predictive of reenlistment
intent. In fact, the most popular enlistment motive, "to get
college benefits," was not significantly correlated with reenlistment
intent. The next most often given reasons for enlistment, "to get
civilian job training," "to travel," and "to find adventure and excitement,"
were significantly, but modestly, associated with reenlistment
intent. The strongest correlates of reenlistment intent came from less
often reported motivations such as "to serve my country" (V = .21)
and "came from a military family" (V = .16).

Also worthy of mention is the significant correlation between
"length of time spent thinking about enlistment" and reenlistment intent
(V = .24). In this case, those who indicated a likelihood of reenlistment
were significantly less inclined to say that they "joined on
impulse" than those who reported little propensity to reenlist.

Satisfaction with the Army. The strongest association found in
the entire analysis was between "overall satisfaction with the Army"
and reenlistment intent (V = .56). Approximately half of those inclined
to reenlist expressed high satisfaction with the Army overall,
as compared to only 6% among those disinclined to reenlist. Only 3%
of those with reenlistment plans indicated low satisfaction with the
Army.

Almost as strong a relationship was found between "work satisfaction
in the Army" and reenlistment intent (V = .47). As many as
81% of those intending to reenlist reported good to excellent work
satisfaction. Only 39% of those leaning against reenlistment reported
a similar level of work satisfaction. This finding directly
corroborates the primary finding in a very recent report on job
satisfaction and reenlistment intent among enlisted men (Goldman and
Worstine, 1977).

Interest in Combat. A major current issue regarding the utilization
of women in the military concerns their potential for combat MOS,
from which they are excluded at present. One of the strongest correlations
obtained in this analysis was that between "interest in combat"
and reenlistment intent (V = .35). Approximately one half of those
women interested in reenlistment also expressed an interest in combat.
By contrast, less than one in five of those not intending to reenlist
reported a similar interest in combat. This finding is symptomatic
of a general pattern inherent in this body of data, which suggests that
women prone to reenlist are more inclined to feel that they can and
should do men's work than women not interested in reenlistment.

Perceived Quality of Personnel. Among the strongest correlates of
reenlistment intent for enlisted women was their perception of the
quality and desirability of their male counterparts. Approximately 1/3
of the women intending to reenlist reported the overall quality of the
men to be good to excellent, as compared to 13% for those planning not
to re-up (V = .27). Almost one-half of those not intending to reenlist
considered the men fair to poor, as compared to less than 1/4 of those
planning to reenlist who felt this way. A similar result was obtained
in connection with the perceived likelihood of marrying an Army man
(assuming single status). Fully 1/2 of the reenlistment-prone respondents
indicated that they would be inclined to marry an Army man, as
compared to approximately 1/4 of those not reenlistment prone (V = .27).
Much the same results were obtained from the inclination to date an
Army man, as related to reenlistment intent (V = .22), and the perceived
overall quality of women in the Army, as related to reenlistment intent
(V = .21).

Likes and Dislikes. Finally, the respondents were asked to indicate
the things they liked and disliked most about the Army. In terms of
relationships to reenlistment intent, the strongest correlate among
these stimuli was "Army tradition" (V = .32). Almost 2/3 of the
reenlistment-prone respondents reported a liking for Army tradition, as
compared to less than 1/3 for those not reenlistment prone. A similar
relationship (V = .32) was obtained between "rules and regulations" and
reenlistment intent. Other significant correlates of reenlistment intent
were "field duty" (V = .30), "dress/hair regulations" (V = .26), "the
NCO's" (V = .21), "dress uniforms" (V = .20), "the food" (V = .20),
"commissioned officers" (V = .19), "this Post" (V = .19), "my MOS"
(V = .18), and "fatigue uniforms" (V = .18). In all of these instances,
the greater the liking, the greater the expressed intent to reenlist.

Somewhat consistent with the above findings concerning quality of
personnel, liking of "most men in the Army" correlated more highly with
reenlistment intent (V = .16) than liking of "most women in the Army"
(V = .03). In both cases, however, the Vs were much lower than found
in regard to quality of personnel. It was also of interest that the
relationship of "heavy lifting" and reenlistment intent, though statistically
significant (V = .15), was not among the strongest predictors
of reenlistment intent. Other poor correlates were like/dislike of
"the barracks" (V = .10) and "abuse from civilians" (V = .04).

In summary, there appear to be, on the basis of this analysis, a
substantial number of promising variables which might serve as a basis
for a first-term female reenlistment predictor battery. A key assumption
relevant to this objective, of course, is a high relationship
between reenlistment intent and actual reenlistment behavior. On the
basis of previous research old and new, such an assumption would appear
to be reasonably sound. According to the most recent research, the
Air Force Human Resources Laboratory (Guinn, Berberich, and Vitola,
197?) found that 92% of those first-termers expressing an interest
in a military career actually did reenlist, whereas 93% of those who
expressed disinterest did not reenlist. Goldman and Worstine (1977)
also reported that "...soldiers' statements regarding reenlistment
intent are highly correlated with actual reenlistment decision..."
Of course, these relationships tend to hold most strongly the less the
time lag between expressed reenlistment intent and actual reenlistment.
Nonetheless, there appears to be sufficient justification for the
assumption that a high relationship exists between reenlistment intent
and actual reenlistment, such that the present analysis might serve as a
useful preliminary exploration of the feasibility of a predictor of
reenlistment decision for women in the Army.

TERM OF SERVICE AND REENLISTMENT INTENT

                           RE-UP INTENT

TERM         WILL          MAYBE         WONT     TOTALS
FIRST        373           407           773      1553 (91%)
NOT FIRST    [illegible]   35            73        153 (9%)
TOTALS       [illegible]   [illegible]   846      1706
             (25%)         (26%)         (49%)

CHI SQUARE = 1.78
p > .05

MOTIVATIONS AND REENLISTMENT INTENT

                                  RE-UP INTENT

ENLISTMENT MOTIVATION             WILL     WONT     P
GET COLLEGE BENEFITS              54%      57%      NS
GET CIVILIAN JOB TRAINING         53%      42%      <.0001
TRAVEL                            50%      41%      <.01
ADVENTURE, EXCITEMENT             48%      37%      <.001
NEEDED THE MONEY                  25%      25%      NS
NO CIVILIAN JOB                   17%      20%      NS
SERVE COUNTRY                     27%      11%      <.001
GET AWAY FROM HOME                13%      17%      NS
GET AWAY FROM SMALL TOWN          13%      13%      NS
DIDN'T LIKE JOB                   11%      12%      NS
CAME FROM MILITARY FAMILY         13%       5%      <.001

OVERALL CHI SQUARE = 77


TABLE 3

LENGTH OF TIME THOUGHT ABOUT ENLISTING
AS RELATED TO REENLISTMENT INTENT

                                      RE-UP INTENT

LENGTH OF TIME                        WILL     WONT
NOT LONG, JOINED ON IMPULSE           21%      36%
THOUGHT ABOUT IT SEVERAL MONTHS       32%      36%
THOUGHT ABOUT IT A YEAR OR SO         24%      21%
WANTED IN ARMY SINCE WAS YOUNG        22%       7%

CHI SQUARE = 70
p < .001
V = .235

TABLE 4

OVERALL SATISFACTION WITH ARMY
AS RELATED TO REENLISTMENT INTENT

                  RE-UP INTENT

SATISFACTION      WILL     WONT
HIGH              43%       6%
MEDIUM            48%      50%
LOW                5%      43%

CHI SQUARE = 396
p < .001
V = .559

TABLE 5

WORK SATISFACTION IN ARMY
AS RELATED TO REENLISTMENT INTENT

                       RE-UP INTENT

WORK SATISFACTION      WILL     WONT
EXCELLENT              38%       8%
GOOD                   43%      31%
AVERAGE                16%      31%
FAIR                    2%      15%
POOR                    1%      14%

CHI SQUARE = 276
p < .001
V = .468


TABLE 6

INTEREST IN COMBAT AS RELATED
TO REENLISTMENT INTENT

                        RE-UP INTENT

INTEREST IN COMBAT      WILL     WONT
A LOT                   23%       6%
SOME                    26%      12%
NOT VERY MUCH           21%      17%
NOT AT ALL              30%      64%

CHI SQUARE = 158
p < .001
V = .354


TABLE 7

PERCEIVED CHANCES OF LEARNING CIVILIAN JOB SKILL
AS RELATED TO REENLISTMENT INTENT

                      RE-UP INTENT

CHANCES OF SKILL      WILL          WONT
EXCELLENT             33%           1?%
GOOD                  41%           2?%
AVERAGE               1?%           2?%
FAIR                  [illegible]   12%
POOR                  [illegible]   2?%

CHI SQUARE = 145
p < .001
V = .339




TABLE 8

PERCEIVED OVERALL QUALITY OF MEN AS RELATED TO
REENLISTMENT INTENT

               RE-UP INTENT

QUALITY        WILL          WONT
EXCELLENT       3%            2%
GOOD           29%           11%
AVERAGE        43%           39%
FAIR           [illegible]   27%
POOR            8%           21%

CHI SQUARE = 94

PERCEIVED LIKELIHOOD OF MARRYING AN ARMY MAN IF SINGLE,
AS RELATED TO REENLISTMENT INTENT

                         RE-UP INTENT

MARRY ARMY MAN           WILL          WONT
DEFINITELY WOULD         23%           13%
PROBABLY WOULD           28%           13%
MIGHT/MIGHT NOT          3?%           36%
PROBABLY WOULDN'T         8%           16%
DEFINITELY WOULDN'T      [illegible]   21%

CHI SQUARE = 89
p < .001
V = .266

TABLE 10

PERCEIVED LIKELIHOOD OF DATING AN ARMY MAN IF
SINGLE, AS RELATED TO REENLISTMENT INTENT

                         RE-UP INTENT

DATE ARMY MAN            WILL     WONT
DEFINITELY WOULD         34%      21%
PROBABLY WOULD           37%      28%
MIGHT/MIGHT NOT          21%      34%
PROBABLY WOULDN'T         4%       8%
DEFINITELY WOULDN'T       3%       9%

CHI SQUARE = 65
p < .001
V = .225

TABLE 11

PERCEIVED OVERALL QUALITY OF WOMEN AS RELATED TO
REENLISTMENT INTENT

               RE-UP INTENT

QUALITY        WILL     WONT
EXCELLENT       6%       2%


TABLE 12

                                      RE-UP INTENT

STIMULUS                              WILL          WONT          V

ARMY "TRADITION"         DISLIKE      26%           55%           .325
                         LIKE         63%           32%

RULES AND REGS           DISLIKE      29%           58%           .322
                         LIKE         5?%           28%

FIELD DUTY               DISLIKE      44%           [illegible]   .296
                         LIKE         45%           20%

DRESS/HAIR REGS          DISLIKE      44%           68%           .257
                         LIKE         45%           22%

THE NCO'S                DISLIKE      13%           29%           .206
                         LIKE         [illegible]   55%

DRESS UNIFORMS           DISLIKE      [illegible]   45%           .205
                         LIKE         63%           41%

THE FOOD                 DISLIKE      47%           65%           .205
                         LIKE         [illegible]   25%

COMMISSIONED OFFICERS    DISLIKE      [illegible]   3?%           .194
                         LIKE         69%           51%

THIS POST                DISLIKE      46%           62%           .186
                         LIKE         47%           28%

MY MOS                   DISLIKE      1?%           33%           .185
                         LIKE         75%           55%

FATIGUES                 DISLIKE      34%           51%           .183
                         LIKE         59%           40%

MALE/FEMALE ATTITUDES RELATED TO PERFORMANCE
IN AIR FORCE TECHNICAL TRAINING


Jeffrey E. Kantor, Bart M. Vitola, and Nancy Guinn
Personnel Research Division
Air Force Human Resources Laboratory
Brooks Air Force Base, Texas

At the request of the USAF Air Training Command, the Personnel
Research Division of the Air Force Human Resources Laboratory initiated a
research project designed to investigate student attitudes toward Air
Force technical training and the relationship between those attitudes and
performance/attrition in technical training. This research was divided
into three main phases: (1) the development and validation of an instrument
both sensitive to student attitudes and related to technical training
performance, (2) a comparison of student attitudes from courses having
differential attrition rates, and (3) an assessment of the relationship
between attitude and performance in general and attitude and performance
within specific subgroups of interest.

The first phase, development and validation of the Technical Training
Student Survey (TTSS), was completed in early 1977 (Kantor, Vitola, & Guinn,
1977), and the remaining phases are, at present, nearing completion. In
the course of this effort, a data base was established consisting of
attitudinal responses and technical training course performance measures
on 12,667 technical training students. From this data base, it is possible
to abstract and study various subgroups of interest. Attitudinal differences
between these groups can be identified and relationships between
attitude and performance can be compared. In this study, the subgroups
chosen for comparison were male and female students. Comparisons drawn
between males and females are of interest for several reasons. First,
male/female differences have, historically, been an area of both scientific
and popular inquiry. Second, with the increase in numbers of women entering
the Air Force, and the military in general, it has become important
to identify and assess male/female differences which might impact on
personnel training and utilisation. Finally, in many technical training
areas, particularly Mechanical and Electronic, males and females exhibit
differential attrition rates unrelated to entering aptitude scores.
Therefore, the objectives of this study were to (a) identify attitudinal
differences between male and female students regarding Air Force technical
training and (b) compare and contrast the relationships between
attitudes and performance for male and female technical training students.


Method 

Subjects. A total of 12,667 non-prior-service enlisted accessions
(10,980 males and 1,687 females) were administered the TTSS while attending
one of 53 different Air Force technical training courses conducted between
September 1974 and August 1975. For comparative purposes, this sample
was subdivided, based upon sex and performance in technical school, into
four groups: (1) Male Graduates (9,984), (2) Male Eliminees (996), (3)
Female Graduates (1,430), and (4) Female Eliminees (257).

Survey Instrument. The TTSS contains 121 items designed to tap
student attitudes about specific aspects of the Air Force technical
training experience. These measures reflect the student's expectations
about training; motivation for training; perceptions of instructors,
fellow students, and physical settings; degree of perceived stress in
training; and the degree of personal satisfaction derived from the
student's training and career choice. Approximate testing time for the
TTSS is 30 minutes. A copy of the TTSS is presented in Appendix A. An
example of the type of item and response format used is presented in
Figure 1.

Survey Administration. The TTSS was administered under standardised
conditions  to  students  in  the  training  setting.  Sampling  points  were 
chosen  to  allow  comparisons  across  all  technical  training  courses, 
between  technical  training  centers,  and  between  courses  having  differing 
attrition  rates.  It  is  assumed  that  the  response  patterns  obtained 
accurately  reflected  the  spectrum  of  attitudes  present  in  the  population 
of  Air  Force  technical  training  students. 

Statistical Analysis. To evaluate male/female and graduate/eliminee
differences, a stepwise discriminant analysis approach was utilised.
This technique provided both an identification of specific differential
attitudes and a relative importance weighting of those differences. Since
only two groups were compared at any one time, this approach is analogous
to a multiple linear regression with a dichotomously coded dependent variable.
Error rate (Type I) was controlled per family of stepwise comparisons such
that the total alpha for each set of comparisons was < .05.
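
For a single survey item, the relationship with a dichotomously coded group variable is an ordinary Pearson correlation between the item responses and the 0/1 code (a point-biserial correlation). The sketch below illustrates only that one-item case, not the stepwise multi-item procedure used in the study; the function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; with y coded 0/1 this is the point-biserial correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, NOT from the TTSS: responses to one five-option item
# and an outcome code (0 = graduated, 1 = eliminated) for eight students.
responses = [1, 2, 2, 3, 4, 4, 5, 5]
eliminated = [0, 0, 0, 0, 1, 0, 1, 1]

r = pearson_r(responses, eliminated)
print(f"r = {r:.3f}, share of outcome variance accounted for = {r * r:.3f}")
```

With this coding, a positive correlation means that higher-numbered response options are more common among eliminees, and the squared correlation is the share of variance in the dichotomous outcome associated with that one item.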


Results and Discussion


The first analysis was accomplished to identify attitudinal differences
between male and female students. For this analysis, sex was the
dependent variable and significant relationships were identified between
the sex of the respondent and his or her responses on 33 of the 121 items.
These  33  items  accounted  for  9. ASX  of  the  dependent  variance.  Based  upon 
the item content and the relative weight of that item in the discriminant
function, the major attitudinal differences between males and females were
summarised and are presented in descending order of importance in Figure 2.
(A complete list of the 33 items and their correlations with the dependent
variable is presented in Appendix B.1.)

From these attitudinal differences, a few general findings seem
apparent. Females show more concern about academics (i.e., desire more
off-duty study time, desire more time be spent on difficult subject
matter). This is possibly related to their higher attrition rate from
technical training schools (males = 8.98%; females = 15.23%) but may
reflect a desire to perform up to standards even if additional time and
effort are required. Females are less satisfied with certain aspects of
the physical environment (classroom temperature, dorm sleeping facilities)
but have a more positive perception of their fellow students (fewer petty
quarrels, more support). Finally, although females seem happier with
their military status (more satisfied with the Air Force, less bothered
by military bearing), it is the males who felt that technical training had
been a more beneficial experience. Overall, it appears that the females
evidenced more academic difficulty, more group cohesion, more satisfaction,
but perhaps were less sure of what benefit they were getting out of
training. These attitudes might be considered typical of those of a group
entering into a new environment, and it is possible that as the numbers
and experiences of females in technical training increase, some of the
male/female differences will be moderated and the similarities increased.

To differentiate between the attitudes of male graduates and male
eliminees, an analysis was accomplished using the 10,980 male subjects
with graduation/elimination being the dependent variable. Significant
relationships were identified between the dependent variable and responses
on 22 of the 121 TTSS items, accounting for 9.76% of the dependent variance.
(A complete list of these items is provided in Appendix B.2.) The major
attitudinal differences between male graduates and eliminees are summarised
in Figure 3.

From these attitudinal differences, it would appear that male
eliminees felt more stress (pressure for perfection, difficulty with
materials, interference with studies), that male graduates placed more
importance on system rewards (job security, avoidance of duties), and
that both male graduates and eliminees held some negative feelings about
each other. Overall, it might be that the male eliminee evidences more
susceptibility to pressure, less personal motivation, and less affinity
for inherent system reinforcers. This makes the eliminee easily discouraged
and very difficult to keep on track and working when arduous effort is
required.

To differentiate between the attitudes of female graduates and
eliminees, an analysis was accomplished using the 1,687 female subjects,
again with graduation/elimination being the dependent variable. Significant
relationships were identified on 12 of the 121 items, accounting for
11.52% of the dependent variance. The major attitudinal differences
between female graduates and eliminees are summarised in Figure 4. (A
complete list of the 12 items is presented in Appendix B.3.)

From these attitudinal differences, it would appear that female
eliminees also felt more stress (pressure for perfection, difficulty
with course materials, student workload), that female graduates were more
motivated (desire more study time, more time on equipment), and that
female graduates placed more importance on system rewards (job security,
off-duty privileges). Again, as with the males, it would appear that the
female eliminee evidences more susceptibility to pressure, less drive
towards the goal, and might be difficult to motivate since she appears
less sensitive to system reinforcers.

The major attitudinal factors found related to graduation/elimination
for males and females are summarised and compared in Table 1. It would
appear evident that considerable overlap exists between the factors
associated with technical training performance for males and females.
Out of the first five most important factors, four are shared by both
males and females, leading to the conclusion that the similarities outweigh
the differences between the sexes. However, the differences which
do exist appear to point to the conclusion that females have somewhat
more academic difficulty than males. Since all students entering any particular
training course are qualified for that course and have generally
comparable aptitude scores, this finding is interesting because it suggests
a difference in ability not currently being measured. Several areas of
future research are therefore suggested. First, it should be determined if
the relationships between aptitude test scores and performance in technical
school are the same for both males and females. Second, course materials
and structure should be investigated for sex bias which might negatively
impact on female performance. Finally, the Air Force selection and classification
system, developed on a primarily all-male force, should be
evaluated to ensure that females are being properly managed with respect
to the maximally effective classification of female personnel and their
assignment to areas wherein they will have the highest probability of
success.

Conclusion 

The male and female attitudes regarding the Air Force technical
training experience were found to differ significantly in several areas.
Some of these differences (e.g., classroom temperature) may be dealt with
directly, but most appear to reflect the differences in attitudes
between a group with experience in a particular environment (males) versus
those of a group entering a new experience (females). It is possible that
as the "newness" of having large numbers of females in technical training
wears off, the similarities between male and female students will
increase. The similarities between factors associated with graduation/
elimination for males and females are substantial and appear to indicate
similar problems in eliminees of both sexes; however, some differences
were noted and appear to be indicative of females having more academic
difficulties. In summary, certain attitudinal differences do exist between
males and females in Air Force technical training, but there was substantial
commonality indicating similar perceptions, concerns, and a similar
relationship between attitude and performance.

APPENDIX B.1


TTSS items significantly related to sex of respondent. Coding: Males = 1,
Females = 2; item options coded as per Appendix A.

Item #         Correlation       Item #         Correlation

111            -.117             75             -.034
97             -.111             82              .064
51             [illegible]       104            -.079
84             -.049             66             -.025
98             [illegible]       4(2)           -.035
49              .059             119            -.005
113            -.095             [illegible]    [illegible]
[illegible]     .038             25(1)           .018
120            -.063             54              .050
109             .007             59             -.036
2(2)           -.054             62             -.030
19(1)           .029             56             -.058
115            [illegible]       38              .025
29             -.054             88              .073
110            -.069             3(1)           [illegible]
69             -.054             118            -.054

Note. Items are listed in order of entry into the stepwise discriminant
analysis.
APPENDIX B.2


TTSS items significantly related to graduation/elimination of male
students. Coding: Graduation = 0, Elimination = 1.

Item #   Correlation

72        .182
1(2)     -.124
17(2)    -.081
23(1)     .074
95        .128
47       -.058
52        .061
74       -.020
29       [illegible]
88        .046
80        .099
110       .031
12(2)    -.003
79        .139
13(2)    -.082
51       -.032
70        .097
3(1)      .019
82        .109
89        .041
69       -.035
84        .031

Note. Items are listed in order of entry into the stepwise discriminate
analysis.
APPENDIX B.3


TTSS items significantly related to graduation/elimination of female
students. Coding: Graduates = 0, Eliminees = 1.


Item #   Correlation

72        .209
111       .086
95        .154
1(2)     -.117
23(1)     .121
82        .142
89        .049
9(2)     -.056
80        .134
84        .004
62        .009
33        .117

Note. Items are listed in order of entry into the stepwise discriminate
analysis.


ARMY COMMANDERS AS GATEKEEPERS FOR INFORMATION

David  L.  Kannaman 
Dr  Robert  Pulliam 


ABSTRACT 


The  Army's  Chief  of  Public  Affairs  performed  a  study  of  company- 
level  command,  to  determine  the  effects  of  "gatekeepers"  on  the  flow 
of  command  information  from  DA  to  soldiers  in  the  field.  Commanders 
and their staffs in 102 representative companies were surveyed, using
a structured interview analyzed by Q-sort techniques. Findings
confirmed the critical role of commanders and of other media in
determining how soldiers perceive the Army and its mission, and revealed
interesting characteristics of modern small-unit command. Commanders
were found to be primary gatekeepers for only 30% of the information
flow studied. Films appeared to be less effective as media than is
commonly believed, and limits of enlisted reading skill appeared to
be a critical constraint on choice of media.

The survey technique used is of technical interest because of its
utility for further research in the Services.
INTRODUCTION


There has been some kind of a formal discipline in military
psychology for about 50 years. Yet after 50 years we are not very
close  to  an  understanding  of  the  dynamics  of  unit  behavior  nor  of 
command.  Alexander  the  Great  probably  knew  more,  intuitively,  about 
the  psychology  of  command  than  we  yet  have  learned  from  research. 

My colleagues and I therefore continuously search for research
tools by which we can measure and describe the relationship between
command and unit behavior. To us, one of the most promising tools
is the study of the flow of information. This applies equally to the
formal flow of orders, regulations, or combat intelligence, and to
the informal flow of private information and unofficial working
communications. It applies equally to military units in combat, and
to housekeeping duties in garrison.

We  were  therefore  particularly  pleased  when  the  Army  selected 
Kinton, Inc., to study the flow of command information at the company
level.  Today  we  will  report  that  study  briefly.  We  will  describe 
the  requirement,  outline  our  method  and  report  some  highlights  of 
the  findings.  We  believe  the  methods  used  have  wide  applicability  in 
military  research  for  addressing  other  problems. 

But,  first,  we  wish  to  express  our  appreciation  to  the  many 
persons  in  the  Army  who  contributed  to  this  research,  and  without 
whose  assistance  the  study  would  not  have  been  possible.  First,  we 
would  like  to  recognize  the  support  of  Colonel  Ralph  E.  Ropp  and 
Captain  Carroll  W.  Williams  who  recognized  the  requirement  for  this 
research  and  who  contributed  substantially  to  its  design  and  direction. 
We are further indebted to the Public Affairs Officers at Forts
Belvoir, Benning, Bragg, Dix, Hood, Lee, Meade and Polk, and to members
of their staffs for untiring work in scheduling and arranging
interviews. Several valuable observations were first offered by senior
commanders and were later confirmed by company level interviews.

Finally,  we  are  most  deeply  indebted  to  the  hard-pressed  company 
commanders  and  first  sergeants  who  took  time  to  tell  us  how  the 
information  program  operates  in  the  working  Army.  This  study  is 
properly  dedicated  to  those  professionals,  and  we  hope  it  tells  some 
small  part  of  their  story. 
REQUIREMENT


The study we are reporting was planned by the Army as a means of
evaluating the Army's Command Information program. When we say
"Command Information" we refer to general information not directly
related to the Army's technical mission, which senior commanders
consider important to disseminate down the line of command. It
includes a wide range of matters, from changes in personnel
policy to the Army's position on weapons systems procurement. For
convenience we will refer to Command Information as "CI".

The Army's CI Program provides specific information, identified
as important by the Army, to individual officers and soldiers at all
levels. When it works it should ensure that major policies and
programs are understood, and that troops understand the Army's role and
mission, as that role is seen by senior command. There are cases in
which the view of the Department of the Army is at variance with the
attitudes of some soldiers (such as concerning race relations) or
with popularly held attitudes (such as concerning the Soviet threat).
In these cases the mission of the information program is to ensure
that members of the Army at least understand the position the Army
takes in pursuing its constitutional mission. Soldiers are not
required to agree, but they need to understand the rationale for the
Army's role.

Because of its importance, the Army is concerned about the
effectiveness of the CI program, and has in the past undertaken
studies of its effectiveness and of the comparative value of various
media or vehicles for transmitting the program's messages. Research
studies as early as World War II analyzed the "Why We Fight"
series on separate scales for information, attitude and motivational
effectiveness. A general finding of most studies has been that the
information program is never fully effective in delivering information
(cognitive content), but that the program is more effective in
communicating information than in causing attitudinal or behavioral
change.


Company  Commanders 

This specific study focused on the role of the company commander
as a "gatekeeper." We will say more about "gatekeepers" in a moment.
There were reasons to believe that company commanders were the key
to effectiveness in the Army CI Program. To begin with, they are in
most cases the point at which the CI program's materials distribution
system stops, and decisions are made as to which materials will be
presented, when, and how. The commander is the person responsible
for formal Commander's Calls, and the one who must select what he
will say to his troops during the limited time they are assembled
together. Finally, he is the senior authority figure in the Army
chain of command who is most personally known and regularly
seen by troops.

The company commander is therefore probably the gatekeeper whose
decisions are most influential for the information program. There
are two reservations in that regard:

First, the decisions made by the commander are often difficult to
distinguish from those of the first sergeant and other orderly room
staff; such decisions may either be based on the recommendations of
others or may in fact normally bypass the commander. Many commanders
do not actually see or read completely the information sent them
through information program channels, but depend on others to screen
those materials first.

Second, the tenure of commanders is sometimes brief, and the
impact of the commander, when the turnover has been rapid, may be less
than in those cases where the commander has been assigned long enough
to establish his position and to develop an administrative routine.


Gatekeepers 


The  question  we  asked  was  about  the  commander's  behavior  as  a 
"gatekeeper"  of  information.  The  role  of  gatekeepers  was  a  key  to 
the  study.  Gatekeepers  determine  what  other  people  receive  through 
channels of communication.


It is hard to fully appreciate how dependent each of us is on
the content of communications. What we know, what we believe, and how
we behave depend on the information we receive through formal
communication channels. Each human being can know directly only those
events which happen within his sight and hearing. All the rest of
his perception of the world results from indirect experience, received
through communications.

Thus most members of a modern society, including soldiers in the
Army, have no direct experience of realities such as floods in
Bangladesh, the space program, the Soviet Union, or the fact that the world
is round. These things are observed by others, and communicated via
a host of private or public channels. Nevertheless, most people
believe in the detailed existence of a wider world, most of which
they have never seen.

How the world beyond personal experience is perceived is totally
a function of communications. No citizen or soldier can know of any
event, unless that event is reflected in the media he receives.

How each person perceives an event (such as the war in Ethiopia) is
totally dependent on the content of communications. That content is
necessarily selective. The channels of communication could not carry,
and individuals could not digest, all that happens in the world each
day. So at many points in the world's communications, there are
"gates" or "filters", points at which signals are sorted, edited and
selected before being passed into the next channel. "Gatekeepers" are
critical causes of the way the world is perceived by others.

Gatekeepers  necessarily  exist  within  the  communications  of 
society,  the  government,  business,  private  institutions,  and  the  Army. 
Either  consciously  or  by  default,  gatekeepers  determine  what  signals 
will  be  received  by  others,  and  therefore,  how  others  will  perceive 
reality. In the Army, commanders are the primary gatekeepers for
information flowing within the chain of command. While they are not
the only gatekeepers, the decisions of company commanders are centrally
important in determining how troops and junior officers perceive the
Army, and themselves, in relation to the Army's mission. This
gatekeeping role is vital in battle as well as in peacetime affairs. It
has never before, to our knowledge, been the subject of specific
research.

Finally,  it  must  be  observed  that  in  the  company  commander's  case, 
the  term  "gatekeeper"  does  not  adequately  suggest  his  information  role. 
Gatekeepers in government, media and business often function in a
manner closely analogous to the gate and filter functions of a computer:
they pass or process information. But the company commander is in
addition an active medium of transmission and display. Commanders
are active advocates. How they promote a policy, and how they display
their personal concern, will determine how their subordinates behave
to a far greater extent than is the case elsewhere in government,
business or industry.

We  have  explained  that  the  objective  of  this  study  was  to  define 
the company commander's role as a gatekeeper of CI. We will now
describe  the  methods  of  the  study. 


METHOD 


In  designing  a  method  for  this  study  we  recognized  a  couple  of 
problems  to  be  overcome.  One  was  that  we  did  not  yet  know  exactly 
what  questions  would  be  most  useful  to  ask,  and  the  other  was  that 
company  commanders  were  likely  to  be  on  the  defensive  when  asked  about 
their CI programs. We had a good method for exploring the wording
of questions in the "Q-sort" technique, but the question of
defensiveness was more serious.

We  know  that  military  commanders  are  not  likely  to  be  frank  when 
asked  how  they  do  things.  They  are  used  to  more  or  less  constant 
inspection, and habitually tell outsiders the story which will make the
unit look good. But we needed to know what actually happened to CI
at the company level, as contrasted to what the regulations or policy
might say. We knew that most commanders would not be using all the
CI materials fully, and that in fact commanders have to avoid many
directed  responsibilities  in  order  to  survive. 


Structured  Interview 


Informality  and  personal  contact  were  obviously  required.  We 
decided  to  use  a  structured  interview,  with  open-ended  responses,  so 
that  we  could  ask  stimulating  questions  and  then  let  the  respondents 
tell us what was on their minds. The three interviewers were all
former members of the military, each with personal experience as a
commander. This was our most valuable asset. But first we had to
decide what to ask.


The  Questionnaire 

The  Army  provided  us  with  a  list  of  questions  which  needed 
answering. Those concerned the effectiveness of several publications,
such as Soldiers magazine and Commander's Call, an information
pamphlet. In addition, we were to explore the effectiveness of media
such as radio, movies and TV, and to explore opinions at the company
level about the CI program. This list of general questions had to be
pruned down to a practical number of concrete questions, and stated in
terms which would provoke useful, measurable responses.
Case Study


A preliminary list of questions was drawn up, reflecting our best
estimate of what questions would be useful and how they should be
phrased. Those questions were tested in a case study of 10 Army units.
In each unit we interviewed the commander and his first sergeant;
four units were in the combat arms, four were combat support and two
were in Army schools.

The  interviewers  used  approximately  two  hours  in  talking  to  each 
respondent. The questions were asked informally, in a conversational
manner, and respondents were encouraged to answer informally and at
length. Longhand notes were taken, in many cases recording the exact
language of key phrases or sentences. New ideas and perceptions were
picked  up  and  explored  further. 

A  principal  purpose  of  these  case  studies  was  to  discover  how  the 
issues  were  perceived  at  company  level,  so  that  our  interview  would 
reflect  the  language,  issues  and  perceptions  of  unit-level  command 
rather than those of Army headquarters.

Responses  of  commanders  and  their  first  sergeants  were  compared 
to  detect  unreliable  data,  and  differences  between  enlisted  and  officer 
perception. 


Q-Sort  Scoring 

A Q-sort procedure was used to identify scorable responses. By
this we mean that a panel of analysts looked at each question, the
analysts taking one question at a time and working independently. For
each question, each analyst sorted the responses into categories of
similar responses. A typical question usually yields from two to
eight recurring responses (plus a miscellaneous category). Sets of
responses emerge which, though phrased differently, reflect similar
opinions. The panel then meets, compares its scoring systems, and
agrees on a common scoring scheme. This method makes it feasible to
score well-designed, open-ended questions as if they were
multiple-choice.
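
The last step of this scoring method, treating categorized responses as
if they were multiple-choice answers, can be sketched in a few lines of
Python. This is an illustration only. The category labels, the sample
answers, and the function names percent_agreement and tally are
hypothetical and are not taken from the study, and the sorting itself was
done by human analysts rather than by program.

```python
from collections import Counter


def percent_agreement(analyst_a, analyst_b):
    """Share of responses that two analysts placed in the same category."""
    same = sum(1 for a, b in zip(analyst_a, analyst_b) if a == b)
    return 100 * same / len(analyst_a)


def tally(coded_responses):
    """Percentage of responses falling in each agreed category."""
    counts = Counter(coded_responses)
    total = len(coded_responses)
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical independent sorts of the same four answers by two analysts.
analyst_a = ["CO only", "1st SG only", "CO and 1st SG", "CO only"]
analyst_b = ["CO only", "1st SG only", "CO only", "CO only"]
print(percent_agreement(analyst_a, analyst_b))

# Hypothetical codes after the panel has agreed on a common scheme.
agreed = ["CO only", "1st SG only", "CO and 1st SG", "CO only"]
print(tally(agreed))
```

A low agreement figure would suggest that the categories, or the question
wording, need further work before the panel settles on a common scheme.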


Model  Description 

From  these  data  a  preliminary  logical  gated-flow  diagram  was 
developed,  to  be  confirmed  and  expanded  later.  That  diagram  described 
the possible patterns of flow for CI within an Army
company. 
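
The idea of a gated flow can be sketched as a chain of filters, each of
which passes or drops an item before it reaches the next channel. The
sketch below is an assumption-laden illustration, not the diagram
developed in the study. The gate names, the item fields "relevant" and
"priority", and the function flow are all hypothetical.

```python
def flow(items, gates):
    """Pass items through the gates in order; each gate keeps or drops an item."""
    for _name, keep in gates:
        items = [item for item in items if keep(item)]
    return items


# Hypothetical gates: (label, rule deciding whether an item passes).
gates = [
    ("first sergeant screens", lambda item: item["relevant"]),
    ("commander selects", lambda item: item["priority"] >= 2),
]

# Hypothetical items arriving at the orderly room.
items = [
    {"title": "pay policy change", "relevant": True, "priority": 3},
    {"title": "weapons procurement", "relevant": True, "priority": 1},
    {"title": "misrouted notice", "relevant": False, "priority": 3},
]

reached_troops = flow(items, gates)
print([item["title"] for item in reached_troops])
```

Reordering or removing a gate in the list changes which items reach the
troops, which is the kind of difference the gating behavior models in the
findings describe.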


Mini  Field  Test 


As  a  result  of  the  case  study,  major  modifications  were  made  to 
the  wording  of  questions  and  the  question  sequence.  A  preliminary 
field test was then conducted at Ft. Belvoir, Virginia. Following this
test the procedures just outlined were repeated. Questions were
scored, their language was improved, and minor adjustments made in
format and sequence. A survey instrument resulted which was clearly
in the language of users, was easy to Q-sort, produced interesting
data, and was reliable to apply.


Field  Test 

This survey instrument then underwent a final rigorous field test.
Twenty-nine units were surveyed at Forts Bragg, Dix and Lee. A total
of  26  company  commanders  and  29  first  sergeants  were  contacted,  in  10 
combat  arms,  9  combat  support,  and  10  school  companies.  The  data  were 
scored  and  analyzed  after  which  minor  changes  were  made,  but  it  was 
then  determined  that  the  field  test  data  were  sufficiently  reliable  to 
include  in  the  final  report. 


Final  Survey 

A total of 102 units (61 combat arms, 21 combat support and 20
schools) from Forts Benning, Bragg, Hood and Polk were included in
the final survey. Eighty-eight company commanders and eighty-four
first sergeants were interviewed. Data from the Field Test and the
final survey were consistent. However, the structured interview was
changed in minor respects following the Field Test, so the data from
the Field Test and Survey are not in every case fully comparable.
Each installation involved in the final survey was visited by two
or more interviewers. This was required in part to lessen the
probability of interviewer bias affecting the data. Additionally,
the interviewers conducted at least two interviews jointly at each
installation -- one interviewing while the other observed, then
reversing roles for the second interview. This procedure ensured
standardization of interviewing technique. Interviews were conducted
in a company setting, typically the office, mess hall, training room
or dayroom of the unit concerned.

Including the final survey, a total of 8 installations (Forts
Belvoir, Benning, Bragg, Dix, Hood, Lee, Meade and Polk) and 144
units were visited during the course of the entire study. The total
interview population included 127 COs and 124 first sergeants. Table
1 reflects the composition of this population by phases.


TABLE 1

Persons Interviewed, By Phase


Phase                    Units   COs   1st SGs

Case Study                  10    10         8
Preliminary Field Test       3     3         3
Field Test                  29    26        29
Formal Survey              102    88        84

TOTALS                     144   127       124
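
The totals in Table 1 can be checked by summing each column over the four
phases. The short Python sketch below does this with the figures from the
table. The variable names are illustrative only.

```python
# Rows of Table 1: phase -> (units, COs, first sergeants)
phases = {
    "Case Study": (10, 10, 8),
    "Preliminary Field Test": (3, 3, 3),
    "Field Test": (29, 26, 29),
    "Formal Survey": (102, 88, 84),
}

# Sum each column across the phases.
units, cos, first_sergeants = (sum(column) for column in zip(*phases.values()))
print(units, cos, first_sergeants)  # 144 127 124
```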
Data Analysis


All responses were evaluated using the Q-sort procedure. The data
were posted to a master tabular matrix, and analyzed by inspection.
Where the findings were interesting and significant, they were pulled
out as summary tables. Table 2, for instance, reflects the opinions of
commanders and of first sergeants as to whether enlisted men read
Soldiers. It shows general agreement that Soldiers is widely read, but
that readership is lower among less senior enlistees.


TABLE 2

Question: What percentage of your men,
by grade, read Soldiers?


                      Interviewees
Grade                 COs    1st SGs

E1-E4                 48%    47%
E5-E6                 69%    67%
PSGs                  83%    88%
Platoon Leaders       83%    88%

A final diagram of information flow, as reported by respondents,
was developed (Table 3). That flow was used further in reporting the
findings.
FINDINGS


Our findings were reported to the Army in a series of briefings,
and in a final report. Most of these have to do with individual Army
publications and are not of general interest. We do want to report a
few things which are of more general interest, and which probably apply
in some measure to the Navy and Air Force as well. We will limit
ourselves to five points.


Gatekeeping and Command

In general, three gating behavior models were discerned. Table 3
applies. Those models are described below:


Commander  is  Prime  Gatekeeper 

In  about  1/3  of  all  units,  the  CO  did  personally  receive,  review, 
and make principal decisions concerning actions to be taken on command
information. In those cases, he would typically read incoming material
selectively,  evaluate  its  importance,  and  mark  it  for  the  attention  of 
others  and  for  further  dissemination  through  formations,  distribution 
or  posting. 


First  Sergeant  is  Gatekeeper 

In a slightly smaller percentage of units this function was
performed by the first sergeant. Four situations were typical: (1) In
more than 50% of cases the CO specifically delegated responsibility for
reading and screening command information, and acted only on items
called to his attention. (2) In other cases, the CO continued the
practice which existed in the unit prior to his assignment, or
otherwise found himself within a pattern of established unit behavior. This
often involved Army "regulations" or policies and existing SOPs
directing the gating of information within the unit. This was perhaps
the most interesting of the gating behavior models observed; it was
identified early in the survey when one of the COs stated that there
was a unit SOP for the handling and dissemination of CI. When he was
asked who directed this SOP, he stated that he didn't know. "It was
here and in effect when I got to the unit a year ago." When his first
sergeant was asked who directed the SOP, his response was identical to
that of the CO with one difference: he had been the first sergeant of
the unit for more than 11 months, and the SOP had been in effect when
he arrived, too. Where CI dissemination SOPs did exist in units, both
COs and first sergeants were asked who established them. In the
majority of the cases they were originated by someone other than the
incumbent COs and first sergeants. (3) Then there were cases in which
a strong first sergeant assumed responsibility, without deliberate
delegation by the CO. (4) Finally, in a few units, the CO's disinterest
in CI led to the first sergeant assuming responsibility for its gating.


TABLE 4

Question 3 - Who decides?
Correlation of CO vs. First Sergeant
Opinion


                      Position
Response              COs    1st SGs

CO only               31%    37%
XO only                1%     5%
1st SG only           24%    31%
CO and 1st SG         36%    26%
Other                  8%     1%


Senior Commander is Gatekeeper

A few units were observed in which a Senior Commander (Battalion
or Post-level) had pre-empted the gatekeeping role, for instance by
conducting the major formations.

Each of the gating behavior models clearly represented a difference
in command style. It would be interesting to know what effect those
differences had on unit effectiveness. Pertinent in this regard is the
issue of mandatory CI formations.
Mandatory CI?


Some company leaders would like to see the institution of
mandatory CI formations. They remember that this was once an Army-wide
requirement and they feel that it would strengthen their resolve, or
their negotiating position vice other activities, to have a formal,
scheduled requirement once again.

Opinions of this recommendation were solicited (Table 5). Only
16% of COs and 30% of first sergeants stated that they would like to
see mandatory CI once again. The remainder of those interviewed
(84% of the COs and 70% of the first sergeants) replied emphatically
"no," on the grounds that they were uniquely aware of their troops'
information needs and of the constraints affecting the dissemination
of that information.

We recommend against any renewed requirement for these reasons:

o Much of the improvement which we observed in the CI program,
and in attitudes toward CI among officers and men, seems to
result from a free hand and a soft sell.

o The company commander in most Army units is actually assessed
more mandatory duty than he or his unit can perform.
Furthermore, unit schedules are complex and crowded. Commanders need
as much freedom and local authority as they can get. We
observed several units with good programs in which CI
formations were a seasonal activity; they did not occur at all during
training season. These commanders had made the rational
decision not to try CI classes for three or four months at a time,
but to have good ones when time permitted.

o The CI message has the greatest punch when the commander
delivers it on his own initiative, because he believes the
message important. A mandatory formation reduces him to the
role of a passive agent, filling a requirement.



TABLE 5


Question: Mandatory CI?


Response              COs    1st SGs

Yes                   16%    30%
No                    84%    70%


Reading  Skill 

We made no direct measurement of reading skill, but we did ask a
series of questions about who can and does read CI publications. The
general answer was that second enlistment soldiers read the publications,
but that those who had no career commitment did not. At all junior
levels there was a serious reading problem. Table 6 applies.


TABLE 6

Question: Who can understand Soldiers?


                      Unit Type
Response              CA     CS     SC

Everybody             97%    39%    97%


This table refers to Soldiers magazine, a publication which is
deliberately popularized and has the simplest reading level of all CI
publications. Note that, in the opinion of unit leaders, nearly all members
can read Soldiers in Combat Arms (CA) and School (SC) units. In Combat
Support (CS) units, however, there is a clear recognition that some
enlisted men cannot understand Soldiers. Regarding all the other
publications studied, there was a clear belief that they were unreadable
to a substantial enlisted population.

This suggests that any CI program which depends primarily on printed
media will fail to reach a large audience.


Films 

We  found  a  surprising  lack  of  interest  in  films  (Table  7). 


TABLE 7

Question: Have you ever used any CI films?


                      Position
Response              COs    1st SGs

Yes                   45%    52%
No                    55%    48%


Only  about  one  half  of  the  units  surveyed  had  ever  used  films  in 
the CI program. Of those who had used films, many had used them only
because  they  were  directed  to  do  so,  and  fewer  than  25%  used  them  as 
often  as  once  a  month.  This  was  true  in  spite  of  the  fact  that  films 
were  readily  available.  An  alternative  to  films  might  be  electronic 
media. 


RECOMMENDATIONS 


Electronic  Media 

It has been widely observed that modern young adults are of a
non-reading generation. For whatever original reason, they are much
less likely to turn to a printed source for information or
entertainment than was the pre-TV generation. It follows that, if they are
to be reached with maximum effectiveness, they will be reached via
radio and TV. But the survey found a low level of interest in, or
knowledge of, existing CI on electronic media. This is partly because
post and regional CI programs are faced with a dilemma in using
electronic media: they cannot reach their audience until prime time,
and then they are unable to compete with commercial programming.

Cost constraints make it unlikely that the Army, or local
installations, can compete with network programs for the attention of the
soldier  audience.  Nevertheless,  radio  is  reaching  a  small  segment  of 
the  potential  audience  steadily  and  effectively,  and  TV  has  a  massive 
potential,  especially  to  reach  the  first  enlistment  soldier. 

Locally produced TV deserves a special note. It is hard for
installations to produce good footage, except for local news and
commanders as talking heads. These last two are of great value.

Finally, the impact of TV and radio on dependents should not be
underestimated. Dependents apparently are more likely to be reached
by daytime programming, and may actually find the programming more
interesting. They like to know what the Army is doing, who the local
leaders  are,  and  what  the  local  units  are  doing.  Soldier  husbands 
may  not  be  very  informative  in  this  regard,  and  families  are  known  to 
be  influential  in  such  vital  decisions  as  reenlistment.  We  recommend 
these  ideas  to  the  Army. 


Feedback 

It is a general principle of systems theory that messages will
be effectively transmitted only in the presence of feedback: information
which tells whether a message is correctly understood. This is
more than a theoretical consideration; perhaps the most effective
practical way to improve the performance of human organizations is to
systematize feedback. That principle has been widely used to improve
industrial processes, communications networks, and instruction in
schools, to name only a few cases.

Several  questions  in  the  survey  asked  whether  COs  and  first 
sergeants  knew  what  information  their  troops  received  and  understood. 
The  worst  of  all  possible  conditions  was  found  to  occur:  company 
leaders  thought  they  knew,  but  in  fact  did  not  know  when  their 
communications  to  troops  were  received  and  understood. 

We therefore recommend that the Army consider designing into its
system a means by which specific key CI messages can be identified,
followed through the information system, and their receipt verified
by sampling comprehension at the troop level. Feedback studies could
be used in normal information operations, or experimentally to determine
what information strategies are most effective.


Reinforcement 


The hypothesis that COs and first sergeants get little credit for
a good CI program was confirmed. Senior commanders often know little
about company CI programs, and in some cases do not care. There was
little monitoring of unit CI programs in most of the units studied.

More significantly, interviewees did not believe that the quality of
their CI program was likely to affect their effectiveness reports. We
were apparently observing a condition in which unit CI programs were
undertaken largely on local initiative (there were outstanding exceptions,
in which battalion or more senior commanders vigorously encouraged
good CI).

Here again, a feedback device would solve the problem automatically.
If a battalion commander really knew in which of his companies the
troops were (and were not) getting the word, he would learn a lot both
about the CI program and the effectiveness of his subordinate commanders.


POTENTIAL OF THE METHOD


At the beginning we said that the methods of this research should
be applied more widely. In fact, we are now applying similar techniques
to study one aspect of fire control in the Field Artillery. We
hope to see it used particularly to address problems of two kinds:
morale, and tactical intelligence.

The  applicability  to  problems  of  morale  is  probably  clear  from  the 
context  of  this  report.  But  Kinton  believes  that  this  method  of 
research  might  lead  to  dramatic  enhancement  of  small  unit  combat 
potential. 

It is a commonplace observation that battle takes place under
conditions of inadequate information. Researchers have repeatedly
reported what every soldier knows: that each officer and man at the
small unit level fights with astonishingly little understanding of the
situation and of his role. This is true in spite of the fact that
men at the company front may observe information vital to the commander,
or to the squad at the flank, and not understand the value of that
information nor report it. Staffs at battalion level do not know what
information to pass, or when to pass it to the men at company and
platoon level.

An observation from REALTRAIN exercises has been that good internal
informal communications are one of the features of winning platoons.
However, communication in conventionally trained units tends to
freeze permanently on contact with enemy fire.

This  matter  has  never  been  given  specific  study.  KINTON 
recommends  it  to  the  Army,  and  to  other  researchers. 


WORKER-ORIENTED & JOB-ORIENTED INSTRUMENTS FOR
EVALUATING JOB PERFORMANCE 1,2


Robert Vineberg
Elaine N. Taylor

Human Resources Research Organization
(HumRRO)


1 Paper presented at 1977 Military Testing Association Conference, San Antonio,
Texas, 17-21 October 1977.

2 Sponsored by the Naval Education & Training Command and the Personnel & Training
Research Program, Office of Naval Research, Contract N00014-75-C-0938
(NR 156-047). The project monitor was Dr. Marshall Farr, Director, Personnel
& Training Research Program.


I  am  going  to  describe  two  instruments  for  obtaining  performance  ratings. 
These  two  instruments  have  been  developed  at  the  task  and  element  level  of  jobs. 
These instruments, the Performance Analysis Inventory (PAI) and the Task Proficiency
Inventory (TPI), were developed as part of an ONR study concerned with
the  performance  capabilities  of  men  at  different  aptitude  levels.  The  part  of 
the  study  concerning  aptitude  levels  has  not  yet  been  undertaken. 

In developing these instruments, we collected data on men in ten Navy
ratings and three pay grades aboard the aircraft carriers Enterprise and
Constellation.

Depending on the job being performed and the type of instrument, the forms
contain between 34 and 93 rating items. A particular feature of this study,
then, is the development of information about performance in many elements of
a job, rather than with regard to a few global measures.

The  rating  instruments  are  based  upon  two  different  models  of  job  analysis. 
Ernest  McCormick  (1972)  has  referred  to  these  approaches  as  worker-oriented  and 
job-oriented models. A worker-oriented approach focuses on elements of behavior
that generalize across tasks and jobs. For example, observing visual displays,
obtaining information from written materials, using non-precision tools,
activating variable setting controls, following fixed procedures, estimating
quantity, analyzing information, or negotiating with people.

A job-oriented approach to job analysis focuses on specific technological
elements of job content. For example, repairing carburetors, drafting business
letters, annealing copper tubing, organizing stock control functions, or translating
Russian newspaper articles.

Forms of our worker-oriented instrument, the PAI, are based upon elements
of jobs taken from McCormick's job analysis questionnaire, the Position Analysis
Questionnaire or PAQ. The jobs in the study were first analyzed with a
modified form of the PAQ that we developed for use in the Navy. Each job was
analyzed by rating the relevance or importance of each of 139 possible worker-oriented
elements. Then, performance rating scales were developed for each
element  of  importance  that  emerged. 



TABLE 1


This slide shows the jobs in the study and the number of worker-oriented
items included in the various forms of the Performance Analysis Inventory for
obtaining performance evaluations.

Forms of our job-oriented rating instrument, the Task Proficiency Inventory,
were based on task inventory data furnished by the Navy Occupational Task
Analysis Program (NOTAP). Here, we used existing task analysis data to identify
elements of performance to be evaluated.

OVERLAY  TO  TABLE  1 

This overlay adds the number of job-oriented items included in the various
forms of the Task Proficiency Inventory for obtaining performance evaluations.
Here, separate instruments were developed by pay grade where such job analysis
data were available. At the time we constructed the scales, NOTAP job analysis
data were not available for Electrician's Mate, Hull Maintenance Technician,
and Interior Communications. The essential feature of both methodologies is
that performance is defined and evaluated in terms of very specific behavioral
or technological referents.


TABLE  2 

This  slide  shows  some  sample  items  from  the  worker-oriented  PAI  for  Aviation 
Boatswain's  Mate  -  Equipment. 


TABLE  3 

This slide shows some sample items from the job-oriented TPI for the same
job.


To  provide  a  basis  for  item  analysis,  performance  data  were  obtained  for 
569  incumbents  in  the  ten  jobs.  For  comparative  purposes,  we  also  obtained 
Performance Evaluation Report scores. This is the instrument that is used
operationally  in  the  Navy  for  evaluating  a  man's  performance. 

FIGURE 1a

This slide shows frequency distributions of scale value usage for E-3 and
E-4 on the different instruments. The worker-oriented PAI and the job-oriented
TPI have seven-point scales, while the Performance Evaluation Report has ten-point
scales. The distributions are displayed with the mid-points of the scales
coinciding in order to avoid the distortion that occurs if data from one type
of scale are expressed in terms of the other.

The PAI & TPI show less skew than the operational instrument, but it must
be pointed out that here we are comparing experimental and operational data.
Obviously, we do not know what characteristics our instruments would demonstrate
if they were administered on a continuing basis by military personnel.


FIGURE 1b


This  slide  shows  the  distributions  for  E-5  and  all  pay  grades  combined. 

As always, skewness increases with grade. The job element or task level
approach has not overcome this problem. Frequency data in the handout show
respectable distributions for our instruments at the E-3 level, but deterioration
setting in at the E-4 level.
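
The trend described above can be checked roughly against the PAI frequencies given in the handout tables. The sketch below is only an illustration: the paper does not say which skewness statistic was used, so the moment coefficient of skewness is assumed here, and the function name `grouped_skewness` is hypothetical.

```python
# PAI frequencies for scale values 1..7, copied from the handout frequency
# tables for E3, E4 and E5 ("Never Has To" responses are excluded, as they
# are from the table totals).
PAI_FREQ = {
    "E3": [420, 686, 1306, 2856, 2394, 1841, 631],
    "E4": [95, 241, 999, 2349, 2821, 2758, 1107],
    "E5": [4, 14, 98, 355, 915, 1052, 642],
}


def grouped_skewness(freqs, first_value=1):
    """Moment coefficient of skewness (m3 / m2**1.5) for grouped data.

    Assumption: this statistic stands in for whatever measure of skew
    the authors had in mind; the paper does not name one.
    """
    values = range(first_value, first_value + len(freqs))
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    m2 = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    m3 = sum(f * (v - mean) ** 3 for v, f in zip(values, freqs)) / n
    return m3 / m2 ** 1.5


for grade, freqs in PAI_FREQ.items():
    print(grade, round(grouped_skewness(freqs), 2))
```

Under this assumption all three values come out negative (ratings pile up at the high end of the scale), and the magnitude grows from E3 to E5, which is consistent with the statement that skewness increases with grade.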

The handout also contains tables of means, standard deviations of subject
means, subject standard deviations, and item standard deviations. These analyses
were undertaken to look for relative leniency, halo, and discrimination
between the two experimental rating instruments. Comparisons among these statistics
show less leniency and halo and better discrimination for the worker-oriented
scales than the job-oriented scales.
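
The four statistics named above can be computed from a ratee-by-item matrix of ratings. The following is a minimal sketch under stated assumptions: the data are invented, the function name `rating_summary` is hypothetical, sample (n-1) standard deviations are assumed, and the reading of each statistic (mean as a leniency indicator, subject standard deviation as a halo indicator, the other two as discrimination indicators) is a common convention rather than something the paper spells out.

```python
from statistics import mean, stdev


def rating_summary(ratings):
    """Summarize a ratee-by-item matrix of ratings.

    ratings: one list of item ratings per ratee; None marks an item the
    ratee "Never Has To" perform and is skipped (an assumption about how
    such responses were handled).
    """
    rows = [[r for r in row if r is not None] for row in ratings]
    subject_means = [mean(row) for row in rows]
    subject_sds = [stdev(row) for row in rows]

    item_sds = []
    for j in range(len(ratings[0])):
        column = [row[j] for row in ratings if row[j] is not None]
        item_sds.append(stdev(column))

    return {
        # Higher overall mean suggests more lenient rating.
        "mean_of_subject_means": mean(subject_means),
        # Spread of ratee means: how well ratees are told apart.
        "sd_of_subject_means": stdev(subject_means),
        # Spread within a ratee across items: low values suggest halo.
        "mean_subject_sd": mean(subject_sds),
        # Spread within an item across ratees.
        "mean_item_sd": mean(item_sds),
    }


# Hypothetical seven-point ratings: three ratees, three items.
example = [[4, 5, 6], [2, 3, 4], [6, 6, 6]]
summary = rating_summary(example)
for name, value in summary.items():
    print(name, round(value, 2))
```

In this toy matrix the third ratee receives the same rating on every item, which pulls the mean subject standard deviation down; that is the pattern a halo effect would produce.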

This completes my presentation. We presently are awaiting permission to
use these instruments in the second phase of our study to collect data on the
performance of men at different aptitude levels.


REFERENCE 

Ernest J. McCormick, Paul R. Jeanneret, & Robert C. Mecham, "A study of job
characteristics and job dimensions as based on the Position Analysis
Questionnaire (PAQ)", monograph, Journal of Applied Psychology, Vol. 56,
No. 4, August 1972.


TABLE 1


NUMBER OF ITEMS IN THE PERFORMANCE ANALYSIS INVENTORY (PAI)
& THE TASK PROFICIENCY INVENTORY (TPI) BY NAVY JOB & PAY GRADE.


                                                     TPI
NAVY JOB                              PAI      E3     E4     E5

Aviation Boatswain's Mate
  Equipment (ABE)                      56      40     40     63
  Fuel (ABF)                           49      40     40     50
  Handling (ABH)                       56      40     40     50
Aviation Ordnance (AO)                 45      54     40     51
Electrician's Mate (EM)                50      --     --     --
Hull Maintenance Technician (HT)       47      --     --     --
Interior Communication (IC)            49      --     --     --
Mess Management Specialist
  (MS-S2 Division)                     39      54     40     56
  (MS-S5 Division)                     38      93     68     86
StoreKeeper (SK)                       34      56     40     61

OVERLAY TO TABLE 1


TABLE 2

SAMPLE ITEMS FROM WORKER-ORIENTED PAI FOR
AVIATION BOATSWAIN'S MATE - EQUIPMENT


 2.  Work produced using energy-powered tools to perform operations not
     requiring great accuracy or precision. (Electric grinders and drills,
     welding equipment, brazing gear, skill saw, etc.)

     Exceptionally                         Exceptionally     Never
     Good              Satisfactory        Poor              Has to
       7    6            5    4    3         2    1            X

 3.  Work accomplished using handling devices. (Pouring zinc from ladles,
     using mechanical fingers, etc.)

     Exceptionally                         Exceptionally     Never
     Efficient         Satisfactory        Inefficient       Has to
       7    6            5    4    3         2    1            X

37.  Remembering information for a brief period of time. (Console recorder,
     launch valve strobe timer readings, steam pressures, etc.)

     Very Reliable     Satisfactory        Very Unreliable
       7    6            5    4    3         2    1

41.  Being aware of and alert to the condition/quality of equipment material
     or weapon systems. (For example, condition of components in catapult
     and recovery gear, etc.)

     Exceptionally                         Exceptionally
     Aware             Satisfactory        Unaware
       7    6            5    4    3         2    1

42.  Being accurate in transcribing. (Copying or posting data or information
     for later use; water brake readings, fluid history reports, etc.)

     Exceptionally                         Very              Never
     Accurate          Satisfactory        Inaccurate        Has to
       7    6            5    4    3         2    1            X

53.  Obtaining job information by seeing differences using far vision.
     (Deck edge operator, aircraft identification to determine correct
     settings for arresting gear, etc.)

     Exceptionally                         Exceptionally     Never
     Good              Satisfactory        Poor              Has to
       7    6            5    4


TABLE 3


     Very                              Very              Never
     Effective         Average         Ineffective       Has to


□  17.  Applying preservatives to cables (COPS, purchase cables,
        bridles, etc.).
□  18.  Cleaning hydraulic filters.
□  19.  Stowing/breaking out parts/equipment.
□  20.  Rigging the barricade.
□  21.  Changing bridle arrestor straps.
□  22.  Replacing "O" rings in valves/cylinders.
□  23.  Painting safety markings on flight deck.
□  24.  Repacking the retract valve.
□  25.  Maintaining logs/records (catapult, flight deck, fuels, etc.).
□  26.  Participating in working parties.
□  27.  Functionally checking catapults by firing no-loads.
□  28.  Ensuring safety lines are in place during no-load firings.
□  29.  Changing purchase cable on "AC" engines (re-reeve).
□  30.  Safety wiring equipment/gear/switches.
□  31.  Measuring slipper wear.


Figure 1a. Frequency Distributions (in %) of Scale Value Usage for
Three Rating Instruments Drawn With the Mid-Points of Scales Coinciding.


Figure 1b. Frequency Distributions (in %) of Scale Value Usage for
Three Rating Instruments Drawn With the Mid-Points of Scales Coinciding.


TABLE 4.  FREQUENCY OF USAGE OF SCALE VALUES FROM E3 DATA ON
THREE RATING INSTRUMENTS (ALL NAVY JOBS, ALL ITEMS)


                        PAI                  TPI
SCALE VALUES          f        %           f        %

1                   420     4.14         144     4.02
2                   686     6.77         180     5.03
3                  1306    12.88         355     9.92
4                  2856    28.18         943    26.35
5                  2394    23.62         851    23.78
6                  1841    18.17         786    21.96
7                   631     6.23         320     8.94
Never Has To       1615                 1585
TOTAL*            10134    99.99        3579   100.00


                        PER
NAVY SCALE VALUE      f        %

1.0                   1      .16
2.0                   7     1.10
2.6                  12     1.88
2.8                  31     4.85
3.0                  34     5.32
3.2                  80    12.52
3.4                 135    21.13
3.6                 174    27.23
3.8                 142    22.22
4.0                  23     3.60
TOTAL               639   100.01

*Totals do not include "Never Has To" perform.


TABLE 5.  FREQUENCY OF USAGE OF SCALE VALUES FROM E4 DATA ON
THREE RATING INSTRUMENTS (ALL NAVY JOBS, ALL ITEMS)


                        PAI                  TPI
SCALE VALUES          f        %           f        %

1                    95     0.92          37     1.00
2                   241     2.32          50     1.35
3                   999     9.63         209     5.63
4                  2349    22.65         762    20.53
5                  2821    27.20         909    24.50
6                  2758    26.60        1122    30.23
7                  1107    10.68         622    16.76
Never Has To        683                 1444
TOTAL*            10370   100.00        3711   100.00


                        PER
NAVY SCALE VALUE      f        %

1.0                  --       --
2.0                  --       --
2.6                   4      .46
2.8                  18     2.09
3.0                  30     3.48
3.2                  47     5.45
3.4                 123    14.27
3.6                 292    33.P3
3.8                 279    32.37
4.0                  69     8.01
TOTAL               862   100.01

*Totals do not include "Never Has To" perform.


TABLE 6.  FREQUENCY OF USAGE OF SCALE VALUES FROM E5 DATA ON
THREE RATING INSTRUMENTS (ALL NAVY JOBS, ALL ITEMS)


                        PAI                  TPI
SCALE VALUES          f        %           f        %

1                     4     0.13           2     0.13
2                    14     0.46          16     1.07
3                    98     3.18          13     0.87
4                   355    11.53         119     7.96
5                   915    29.71         326    21.75
6                  1052    34.16         571    38.17
7                   642    20.84         449    30.01
Never Has To        118                  459
TOTAL*             3080   100.01        1496    99.99


                        PER
NAVY SCALE VALUE      f        %

1.0                  --       --
2.0                  --       --
2.6                  --       --
2.8                  --       --
3.0                   1      .98
3.2                   1      .98
3.4                   4     3.92
3.6                  29    28.43
3.8                  47    46.08
4.0                  20    19.61
TOTAL               102   100.00

*Totals do not include "Never Has To" perform.


TABLE 7.  FREQUENCY OF USAGE OF SCALE VALUES FROM E3-E5 DATA ON
THREE RATING INSTRUMENTS (ALL NAVY JOBS, ALL ITEMS)


                        PAI                  TPI
SCALE VALUES          f        %           f        %

1                   519     2.20         183     2.08
2                   941     3.99         246     2.80
3                  2403    10.19         577     6.57
4                  5560    23.58        1824    20.76
5                  6130    25.99        2086    23.74
6                  5651    23.96        2479    28.22
7                  2380    10.09        1391    15.83
Never Has To       2416                 3488
TOTAL*            23584   100.00        8786   100.00


                        PER
NAVY SCALE VALUE      f        %

1.0                   1      .06
2.0                   7      .44
2.6                  16     1.00
2.8                  49     3.06
3.0                  65     4.06
3.2                 128     7.99
3.4                 262    16.34
3.6                 495    30.88
3.8                 468    29.20
4.0                 112     6.99
TOTAL              1603   100.02

*Totals do not include "Never Has To" perform.


TABLE 8


                 MEAN                 SD OF MEANS            N
JOB        PAI    TPI    PER       PAI    TPI    PER      PAI   PER

E3
ABE       4.44   4.71   5.05      1.29   1.25    .94       46    29    3.42
ABF       4.13   4.35   5.35      1.23   1.13    .68       23    13    3.51
ABH       4.64   4.97   5.22      1.18   1.04   1.06       42    33    3.47
AO        4.30   4.77   5.28      1.52   1.37    .83       37    24    3.49
EM        4.36    --    4.94      1.00    --    1.19       17    13    3.35
HT        4.29    --    5.78      1.28    --     .66       19     9    3.63
IC        4.73    --    5.39       .75    --     .33        8     6    3.52
MS-S2     3.47   3.82   4.86      1.22   1.29   1.20       20    12    3.36
MS-S5     4.29   4.56   5.38      1.33   1.34    .90       33    17    3.52
SK        4.16   4.63   4.77      1.23   1.03    .88       10     5    3.30

E4
ABE       5.19   5.25   5.51       .89    .92    .73       22    12    3.55
ABF       5.00   5.14   5.67       .87    .92    .64       28    18    3.60
ABH       5.31   5.31   6.01      1.12   1.12    .45       30    24    3.70
AO        4.87   5.02   5.69      1.29   1.27    .63       36    28    3.61
EM        4.40    --    5.26      1.10    --     .67       31    30    3.48
HT        4.75    --    5.63       .88    --     .94       24    22    3.60
IC        4.45    --    5.30       .99    --     .63       15    13    3.49
MS-S2     5.01   5.12   5.73      1.09   1.03    .64       23    16    3.62
MS-S5     5.51   5.90   6.49       .56    .64    .59       12     8    3.85
SK        5.12   5.66   5.97       .87    .72    .53       18    14    3.69

E5
ABE       5.45   5.58   6.50       .99    .87    NC*        9     3    3.89
ABF       5.61   5.74   5.87       .63    .59    .17        8     4    3.66
ABH       5.78   5.86    **        NC     NC     **         3    **     **
AO        6.20   6.14    **       1.00   1.07    **         5    **     **
EM        5.03    --    6.07       .26    --     .58       10     4    3.72
HT         NC     --     NC        NC     --     NC         2     2     NC
IC        4.67    --     NC        NC     --     NC         3     2     NC
MS-S2     5.35   5.54    NC        .85    .86    NC        11     1     NC
MS-S5     6.00   6.26    **        .87    .60    **        17    **     **
SK        5.32   5.73   6.23      1.19   1.09    .61        7     4    3.77

E3-E5
ABE       4.77   4.96   5.28      1.22   1.16    .94       77    45    3.49
ABF       4.74   4.91   5.57      1.12   1.08    .64       59    35    3.57
ABH       4.95   5.14   5.55      1.19   1.07    .94       75    57    3.57
AO        4.69   4.97   5.51      1.45   1.34    .75       78    52    3.55
EM        4.49    --    5.22      1.00    --     .87       58    48    3.47
HT        4.62    --    5.72      1.05    --     .86       45    34    3.62
IC        4.56    --    5.35       .87    --     .56       26    22    3.51
MS-S2     4.51   4.72   5.36      1.35   1.30    .98       54    29    3.51
MS-S5     5.00   5.28   5.74      1.34   1.32    .96       62    25    3.62
SK        4.89   5.38   5.78      1.12    .99    .81       35    23    3.63

OVERALL   4.73   5.04   5.49      1.23   1.20    .86  PAI 569   370
                                                      TPI 440


*The number of cases was so low that the statistic was not computed where NC is shown.
**No data were available.


TABLE 9.  NUMBER OF TIMES MEANS ON ONE RATING INSTRUMENT EXCEED ANOTHER


          PAI > TPI     PAI > PER     TPI > PER

E3           0/7           0/10          0/7
E4           0/6*          0/10          0/7
E5           1/7           0/7**         0/4**

*A tie occurred in one comparison.

**PER data were not available for 3 Navy jobs, thus reducing
the number of comparisons that could be made between PAI
and PER to 7, and between TPI and PER to 4.


TABLE 10.  NUMBER OF TIMES STANDARD DEVIATIONS OF MEANS ON ONE RATING
INSTRUMENT EXCEED ANOTHER


          PAI > TPI     PAI > PER     TPI > PER

E3           5/7           9/10          6/7
E4           3/6           8/10          7/7
E5           4/6           2/3           2/2
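

The counts in the two tables above are simple pairwise tallies over jobs. As a rough illustration, the sketch below recomputes the PAI > TPI entries for E3 and E4 from the job-level means and standard deviations listed in the preceding table of means. The function name `count_exceeds` is hypothetical, and dropping ties from the denominator is an assumption based on the footnote about a tie.

```python
# (PAI, TPI) pairs for the seven jobs that have both instruments,
# in the order ABE, ABF, ABH, AO, MS-S2, MS-S5, SK.
E3_MEANS = [(4.44, 4.71), (4.13, 4.35), (4.64, 4.97), (4.30, 4.77),
            (3.47, 3.82), (4.29, 4.56), (4.16, 4.63)]
E3_SDS = [(1.29, 1.25), (1.23, 1.13), (1.18, 1.04), (1.52, 1.37),
          (1.22, 1.29), (1.33, 1.34), (1.23, 1.03)]
E4_MEANS = [(5.19, 5.25), (5.00, 5.14), (5.31, 5.31), (4.87, 5.02),
            (5.01, 5.12), (5.51, 5.90), (5.12, 5.66)]


def count_exceeds(pairs):
    """Return (times first value exceeds second, comparisons made).

    Tied pairs are left out of the comparison count (an assumption
    drawn from the footnote to the table of means comparisons).
    """
    untied = [(a, b) for a, b in pairs if a != b]
    wins = sum(1 for a, b in untied if a > b)
    return wins, len(untied)


print("E3 means, PAI > TPI:", count_exceeds(E3_MEANS))
print("E3 SDs,   PAI > TPI:", count_exceeds(E3_SDS))
print("E4 means, PAI > TPI:", count_exceeds(E4_MEANS))
```

With these inputs the tallies are 0 of 7, 5 of 7, and 0 of 6, which agree with the corresponding PAI > TPI entries reported for E3 and E4.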


TABLE 11.  MEAN OF SUBJECT STANDARD DEVIATIONS
FOR THREE RATING INSTRUMENTS


             E3      E4      E5     E3-E5

PAI
ABE         0.84    0.94    0.59    0.84
ABF         0.72    0.67    0.69    0.69
ABH         0.86    0.74    0.63    0.80
AO          0.65    0.67    0.73    0.67
EM          0.75    0.79    0.68    0.76
HT          0.55    0.64    NC*     0.60
IC          0.71    0.71    0.75    0.71
MS-S2       0.70    0.71    0.62    0.69
MS-S5       0.69    0.68    0.60    0.66
SK          0.76    0.73    0.66    0.73

TPI
ABE         0.80    0.83    0.77    0.81
ABF         0.76    0.66    0.65    0.70
ABH         0.74    0.78    0.69    0.75
AO          0.61    0.68    0.62    0.64
EM           **      **      **      **
HT           **      **      **      **
IC           **      **      **      **
MS-S2       0.62    0.71    0.56    0.65
MS-S5       0.57    0.61    0.56    0.6/
SK          0.51    0.67    0.66    0.62

PER
ABE         0.47    0.66    0.29    0.51
ABF         0.67    0.48    0.31    0.54
ABH         0.60    0.47    NC      NC
AO          0.50    0.54    NC      NC
EM          0.47    0.60    0.66    0.57
HT          0.36    0.44    NC      NC
IC          0.56    0.52    NC      NC
MS-S2       0,n3    0.47    NC      NC
MS-S5       0.33    0.33    NC      NC
SK          0.47    0.35    0.46    0.40

*The number of cases was so low that the statistic
was not computed where NC is shown.

**No data were available.


TABLE 12.  NUMBER OF TIMES AVERAGE SUBJECT STANDARD DEVIATIONS ON ONE
RATING INSTRUMENT EXCEED ANOTHER FOR A GIVEN PAY GRADE


          PAI > TPI     PAI > PER     TPI > PER

E3           6/7          10/10          7/7
E4           4/6          10/10          7/7
E5           5/7           4/4           3/3


TABLE 13.  MEAN OF ITEM STANDARD DEVIATIONS FOR THREE RATING INSTRUMENTS.


             E3      E4      E5     E3-E5

PAI
ABE         1.52    1.18    1.14    1.42
ABF         1.38    1.09    0.88    1.26
ABH         1.40    1.27    0.62    1.36
AO          1.60    1.42    1.17    1.56
EM          1.26    1.35    0.74    1.21
HT          1.42    1.12    0.16    1.24
IC          1.03    1.15    0.86    1.10
MS-S2       1.36    1.31    1.00    1.49
MS-S5       1.39    0.85    0.98    1.31
SK          1.34    1.07    1.23    1.27

TPI
ABE         1.45    1.19    1.07    1.28
ABF         1.26    1.15    0.81    1.24
ABH         1.26    1.27    0.49    1.09
AO          1.52    1.44    0.97    1.18
EM           **      **      **      **
HT           **      **      **      **
IC           **      **      **      **
MS-S2       1.52    1.22    0.96    1.45
MS-S5       1.52    0.80    0.84    1.21
SK          1.05    1.00    1.05    0.98

PER
ABE         1.05    0.88    0.38    1.03
ABF         0.72    0.78    0.40    0.81
ABH         1.07    0.60    NC*     0.98
AO          1.05    0.82    NC      0.94
EM          1.36    0.88    0.75    1.05
HT          0.61    1.02    0.28    0.97
IC          0.55    0.81    0.56    0.75
MS-S2       1.26    0.80    NC      1.03
MS-S5       0.98    0.61    NC      0.92
SK          1.03    0.70    0.78    0.92

*The number of cases was so low that the statistic was
not computed where NC is shown.

**No data were available.


TABLE 14.  NUMBER OF TIMES AVERAGE ITEM STANDARD DEVIATIONS ON ONE
RATING INSTRUMENT EXCEED ANOTHER FOR A GIVEN PAY GRADE


          PAI > TPI     PAI > PER     TPI > PER

E3           7/9           9/10          7/7
E4           3/6          10/10          7/7
E5           7/7           5/7           4/4


Promotion Evaluation for Inter-Organizational Referrals
A Behavioral Expectation Approach

Charles N. MacLane
U.S. Civil Service Commission


Introduction 

This paper describes the design and the first phase of development
of a system for the initial evaluation of Federal personnel specialists
for promotion and transfer across organizations. It was created to be
part of an existing automated personnel record system (called the Federal
Automated Career System) which presently screens individuals on background
and experiential data for referral to agencies other than their
own. The existing FACS referral process is limited in that it can
validly screen individuals for particular positions only in terms of broad,
general categories of experience and background. A method of meaningfully
assessing individuals within these broad categories was needed.

It was considered important to tailor the methodology to the parameters
of the proposed assessment system, especially in view of trends
in current research on performance appraisals. Grey and Kipnis (1976),
for example, found that variance in supervisory ratings of performance
showed a contrast effect; that is, the more employees a supervisor had
whom he labeled "bad", the higher the overall mean rating he gave his
employees. Thus, unless there is some means of standardizing ratings
across organizations, different ratings may be given based on the number
of poor employees a supervisor has. It is therefore well to consider carefully
the possibly differential effects of contextual variables.

In the present study, the extent of coverage of the proposed
system (about 12,000 personnel specialists in four major specialty areas
and located in hundreds of different organizational settings) was seen
as the primary variable with which the methodology had to deal. This
in turn raised two issues: 1) Most important was to establish that
there were dimensions of personnel work and of things which workers did
within these dimensions which were common to all or many of the organizations
involved; and 2) It was necessary to have a method of standardizing
performance levels within these dimensions which could be used
across organizations. If these issues could be resolved, it would provide
a basis for insuring the validity of the appraisal process.

However, given the extent of coverage, it was not expected that a single
study would provide a complete enough sampling of the behavioral domain
to completely address the two issues raised above. Therefore, a
multi-stage developmental process was planned in which the initial
operation of the system would provide data needed for subsequent stages.

Figure 1. Algorithm of Assessment Process


In general, then, the project would proceed as follows:

1. A first-stage assessment instrument would be developed which
would provide ratings on dimensions of personnel work common
to the organizations involved.

2. There would also need to be a mechanism for collecting further
data on the work dimensions which would be collected simultaneously
with the assessments.

3. Data would be stored on each individual in terms of each dimension
and this would interface with biographical data.

4. Operationally, a request for referrals would be made in terms
of dimensions critical to the open position and a referral list
generated based on the sum of critical dimension scores plus
the biographical data.

5. After some period of operational time, the data on the work
dimensions would be used to revise and expand the instrument.
This process would be repeated whenever it was felt necessary.

Figure 1 illustrates a generalized model of the operational
system.
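
The referral step (item 4 above) can be sketched in a few lines. This is an illustration under stated assumptions only: the paper does not describe how the biographical data are scored or combined, so a single numeric `biographical_score` is assumed, and the names `Candidate` and `referral_list`, the dimension labels, and all scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    dimension_scores: dict      # dimension name -> stored rating
    biographical_score: float   # assumed numeric summary of background data


def referral_list(candidates, critical_dimensions, top_n=3):
    """Rank candidates by the sum of their scores on the dimensions
    critical to the open position, plus the biographical score."""
    def total(candidate):
        dims = sum(candidate.dimension_scores.get(d, 0)
                   for d in critical_dimensions)
        return dims + candidate.biographical_score

    ranked = sorted(candidates, key=total, reverse=True)
    return [(c.name, total(c)) for c in ranked[:top_n]]


# Hypothetical stored records and a hypothetical referral request.
pool = [
    Candidate("A", {"problem solving": 6, "advising": 4, "writing": 5}, 2.0),
    Candidate("B", {"problem solving": 5, "advising": 7, "writing": 3}, 1.0),
    Candidate("C", {"problem solving": 3, "advising": 5, "writing": 7}, 3.0),
]
print(referral_list(pool, ["problem solving", "advising"], top_n=2))
```

Because only the dimensions named in the request enter the sum, the same stored record can rank differently for different open positions, which is the point of requesting referrals in terms of critical dimensions.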


Methodology 

It was decided to adopt the Behavioral Expectation Methodology of
Smith and Kendall (1963) because the development and operational use of
this type of performance appraisal would provide much of the data required.

For the present study, a sample of 69 personnel specialists (representative
of the grade levels, specialties and organizations which
were to be included in the system) wrote over 500 short, critical-incident-type
statements describing highly effective, moderately effective,
and ineffective behaviors in personnel work. In addition, 23
broad dimensions or factors of personnel work were developed by a panel
of five very knowledgeable personnel specialists from lists of dimensions
developed previously. (In other studies utilizing this methodology, the
dimensions have been derived from groupings of the behavioral statements
themselves.) The 69 specialists then placed each behavioral statement
under the one of the 23 dimensions which it best exemplified and, subsequently,
indicated on a seven-point scale the level of effective
performance which that statement represented in terms of the dimension
into which it had been placed.


Personnel Work Dimensions


SOLVING 2.8 .8 4.2 1.0 5.8


The first type of data obtained through these procedures was the
number of placements of each behavioral statement under each dimension
as a proportion of the total number of placements of that statement.
Statements and dimensions were retained, altered, or discarded based
on a 60% criterion through what might be termed a qualitative cluster
analysis. The result was that 11 dimensions (each exemplified by nine
to thirteen statements) remained. Thus, it had been objectively
established that personnel work in the agencies represented in this
sample of 69 personnel specialists could be, at least in part, described
by these 11 dimensions and that the dimensions had similar meanings,
in terms of behavior, across these organizations. Further, the
qualitative cluster analysis provided one check on the validity of both
the dimensions and the behavioral statements.
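
The 60% retention rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the single-letter dimension labels, and the sample placements are hypothetical, and the study's "qualitative cluster analysis" also involved judgment (altering statements and dimensions) that a simple threshold does not capture.

```python
from collections import Counter


def placement_proportions(placements):
    """Return {dimension: share of judges who placed the statement there}."""
    counts = Counter(placements)
    total = len(placements)
    return {dim: n / total for dim, n in counts.items()}


def modal_dimension(placements, criterion=0.60):
    """Return (dimension, share) when the most frequent placement meets
    the criterion; return None when agreement is too low."""
    props = placement_proportions(placements)
    dim, share = max(props.items(), key=lambda item: item[1])
    return (dim, share) if share >= criterion else None


# Hypothetical placements of two statements by 69 judges.
clear_statement = ["A"] * 48 + ["B"] * 15 + ["C"] * 6   # 48 of 69 agree
split_statement = ["A"] * 35 + ["B"] * 34               # 35 of 69 agree

print(modal_dimension(clear_statement))
print(modal_dimension(split_statement))
```

A statement that returns None here would be a candidate for alteration or removal rather than automatic deletion, which is closer to how the text describes the process.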

The second type of data gathered, the means and standard deviations
of the effectiveness ratings on the seven-point scale, addressed the issue
of standardization of performance levels across organizations. The
criteria for retention were that the means of the statement scale scores
be distributed so as to represent the whole range of the scale and that
the standard deviation be below 1.20. (Only the statements passing the
60% criterion were used.)
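
As a rough sketch of the second screen, the following Python computes the mean and standard deviation of one statement's effectiveness ratings and applies the 1.20 cutoff. The ratings are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption; the paper does not say which form was used.

```python
from statistics import mean, stdev

MAX_SD = 1.20  # retention threshold stated in the text


def summarize(ratings):
    """Return (mean, standard deviation) of one statement's ratings."""
    return mean(ratings), stdev(ratings)


def passes_sd_criterion(ratings, max_sd=MAX_SD):
    """True when the judges agree closely enough on the scale value."""
    return stdev(ratings) < max_sd


# Hypothetical ratings on the seven-point effectiveness scale.
agreed = [6, 7, 6, 7, 6, 6, 7]
disputed = [1, 7, 2, 6, 1, 7, 4]

for label, ratings in (("agreed", agreed), ("disputed", disputed)):
    m, s = summarize(ratings)
    print(f"{label}: mean={m:.2f} sd={s:.2f} "
          f"retain={passes_sd_criterion(ratings)}")
```

The second retention criterion, that the retained means span the whole scale, is a property of the set of statements for a dimension and would need a separate check across statements.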

Only three behavioral statements could be retained for each
dimension, although it had been hoped that five could be used. As
indicated earlier, it was expected that there would be difficulty in
finding qualified behavioral statements without more sampling of the
behavioral domain than was possible in this initial study. Table 1
shows the final dimension definitions and the means and standard
deviations of the effectiveness ratings for their statements.


Performance  Appraisal  Pre-Test 

Although  scales  which  had  been  developed  were  less  inclusive  than 
had  been  hoped,  it  was  decided  to  proceed  with  the  pre-test  because, 
given  that  the  selection  ratio  would  be  very  low,  that  the  instrument 
was  for  short-term  use,  and  that  there  were  no  assessment  procedures 
available  at  present,  an  instrument  of  low  validity  could  still  be  of 
considerable  utility. 

The first instrument tested was a supervisory rating form based on
the retained dimensions and behavioral statements. The primary analysis
was of intra-observer reliability; inter-observer reliability was
considered less satisfactory because of problems associated with
obtaining raters and recent studies showing that ratings may be affected
by organizational level, etc. (e.g., Zedeck and Baker, 1977). The Leik
and Matthews (1968) developmental scaling procedure was adopted. Since
the three behavioral statements exemplifying each dimension had been
ordered from ineffective to effective, the ratees could be rated on each
separately. The Leik and Matthews procedure gives a measure of the
inconsistency between the order of rating and the order in which the
statements had been scaled. This coefficient of scalability is similar
to a Guttman coefficient, but it is a finer measure in the sense that it
reflects not only the frequency of reversals in the hypothesized
"correct" order but also the severity of particular reversals
through differential weighting of each type of ordering. A representative
sample of 52 raters appraised 147 ratees on each of the three behavioral
statements for each of the 11 dimensions. The statements were randomly
ordered under their dimension. All the subjects were asked to base their
ratings on ratee behaviors, and half of them were asked to record these.


Table 2

Comparative Error Statistics - Supervisor and Self-Assessment Data

                                    Supervisor    Self-Assessment

Scalability Coefficients
  Dimension I                          .55             .55
  Dimension II                         .45             .64
  Dimension III                        .57             .68
  Dimension IV                         .45             .51
  Dimension V                          .46             .46
  Dimension VI                         .60             .56
  Dimension VII                        .31             .62
  Dimension VIII                       .60             .68
  Dimension IX                         .36             .70
  Dimension X                          .18             .42
  Dimension XI                         .51             .62

Estimated Total Rater Error (%)       26.7             8.9

The critical findings relating to supervisory ratings were: 1) as
Table 2 indicates, several of the scalability coefficients were
unacceptably low; 2) of the total number of ratings made, 26.7% were
reversals (i.e., errors) which could be attributed to the rater (as
opposed to scale error); and 3) the behavioral statements given were
unacceptable. The rater error statistic includes reversals which could
reasonably be attributed to rater inattention rather than an incorrect
ordering of the behavioral statements. Anecdotal evidence indicated
that the supervisor sample had neither the time nor the complete
knowledge of subordinate behaviors to enable them to supply behavioral
statements and that lack of time and motivation had led to much of the
rater error. Thus it was decided that a self-assessment approach using
essentially the same format could reduce the type of error found in the
supervisory ratings because the raters would be more knowledgeable of
their own behavior and motivated to seek new positions for themselves.


The  Self-Assessment  Instrument 

A sample of 62 personnel specialists satisfactorily completed the
self-assessment form, which was essentially unchanged from the
supervisory format except for the instructions (Figure 2). Comparative
data on the supervisory and self-assessment ratings are given in Table 2.

The scalability coefficients shown represent the proportional difference
between randomly ordered ratings and ratings which follow perfectly the
hypothesized continuum of effectiveness. Since all the scalability
coefficients except two represent significant improvements over chance,
it is more meaningful to look at the trend across the dimensions from the
supervisory to the self-assessment scales. It can be seen that there is
improvement in every case but one and in some cases the improvement is
considerable. The behavioral statements were significantly improved
both in quantity and depth of coverage.

The most important finding was that the criteria used in developing
the scales were related to the level of the scalability coefficients
found for the self-assessment but not for the supervisory assessments.
That is, in constructing the assessment scales, attention was paid to
selection of the behavioral statements with minimum standard deviations
and means equally spaced along the scale of one to seven.

It is clear from a comparison of the Table 1 means and standard
deviations that the scale construction was reasonably successful except
in maintaining the appropriate distance between the means of the
behavioral statements representing the highly effective and moderately
effective points on the scales for the eleven dimensions. The closer
together the means were placed, the more likely it would be that these two
levels would be confused and that a reversal would be made in the rating.
It was therefore hypothesized that the scalability coefficients (which
reflect errors) would be related to the distance between these two means,
if the scale were the cause of the errors, and that the scalability
coefficients would be unrelated to the other scale construction criteria.
Table 3 shows that this prediction is borne out for the self-assessment
data and not for the supervisory data. This suggests that the error in
the self-assessment scales can be decreased through improvements in
scale construction, whereas the error in the supervisory instrument cannot.
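
The Table 3 entries are correlations computed across the eleven dimensions between a scale construction criterion and the scalability coefficients. A minimal Python sketch of that calculation follows. The self-assessment coefficients are taken from Table 2, but the gap values are hypothetical stand-ins, because the Table 1 means are not reproduced here; the printed result therefore illustrates the method only and is not the paper's .63.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


# Self-assessment scalability coefficients, dimensions I-XI (Table 2).
self_coefficients = [.55, .64, .68, .51, .46, .56, .62, .68, .70, .42, .62]

# HYPOTHETICAL high-anchor-minus-moderate-anchor mean gaps, one per
# dimension; the real values would come from Table 1.
hypothetical_gaps = [1.2, 1.6, 1.9, 1.1, 0.9, 1.4, 1.5, 1.8, 2.0, 0.8, 1.3]

print(round(pearson_r(hypothetical_gaps, self_coefficients), 2))
```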


Second  Phase 

It is now planned to put the self-assessment instrument into
operational use for one year. During this time, thousands of personnel
specialists will be filling out self-assessment forms and at the same
time contributing behavioral statements for further development of the
instruments. These behavioral statements will provide a comprehensive
sampling of the behavioral domain. The categorization and scaling
process can then be repeated. It is expected that the behavioral statements
derived from this large sample will allow more complete definitions of
the present eleven dimensions and that more dimensions can be developed.
At the end of an operational year, the self-assessment form will be
revised.
Table 3

Correlational Data Relating Scale Construction
Criteria and Coefficients of Scalability

                                                        Form
Scale Construction Criteria                 Supervisory    Self-Assessment

1. Sum of scale standard deviations             .24             -.13

2. High anchor scale mean minus low
   anchor scale mean                           -.44             -.07

3. Low anchor scale mean                       -.33              .35

4. Moderate anchor scale mean                  -.36             -.09

5. High anchor scale mean                      -.23              .49

6. Moderate anchor scale mean minus
   low anchor scale mean                       -.46             -.37

7. High anchor scale mean minus
   moderate anchor scale mean¹                 -.01              .63*

*p = .05

¹It was hypothesized that the majority of variance in errors across
the dimensions would be related to this measure and that it would
therefore correlate significantly with the scalability coefficients.
Discussion


This project was referred to as multi-phase because a continuing
feedback and revision process is envisioned in which the behavioral
statements which are generated as part of the assessment process can be
analyzed as reflections of the nature of personnel work in the Federal
government and, if changes are seen in the type of behavioral statements
being received, the instruments can be revised accordingly. The
dimensionality of personnel work can be examined across and within
organizations by factor-analytic or multi-dimensional scaling procedures.

The data which will be collected have implications beyond their use
for the assessment system. As Blood (1974) has discussed, the data can
be made available for training purposes, for organizational diagnoses
(e.g., for ascertaining how different organizational levels view the work
content), and for job analysis. The real value of the whole process
must lie in the on-going nature of the assessment procedures, in the
built-in feedback and, consequently, in our increased knowledge of
the performance appraisal process as one of many inter-related
organizational systems which continuously affect, and are affected by, one
another.
EFFICACY OF CERTAIN MEASURES
IN PREDICTING ARMY OFFICER PERFORMANCE


Arthur C. F. Gilbert, Ph.D.

U.S. Army Research Institute for the Behavioral and Social Sciences*

Alexandria, Virginia 22333


The leadership research program in the Army Research Institute for
the Behavioral and Social Sciences (ARI) has included work in two areas
related to this paper. One of these areas has involved the identification
of leadership styles (Helme, Willemin, & Grafton, 1971), while the other
area involves research on the validation of associate ratings (Parrish &
Drucker, 1957; Haggerty, 1963; Gordon & Medland, 1965; Downey, Medland, &
Yates, 1976). These two broad research domains form the basis for the
design of this research.

The objective of this research was to evaluate the efficacy of
certain measures obtained in the Officer Basic Course (OBC) in predicting
subsequent on-the-job first duty tour performance, with particular emphasis
on the value of final course peer ratings in the prediction scheme as
compared with the other predictor variables.

The predictor variables were the seven sub-scales of the Officer
Evaluation Battery (OEB), peer ratings obtained at mid-course, final
course peer ratings, and the final course grade obtained in OBC. The
seven scales of the OEB are Combat Leadership (Cognitive),
Technical-Managerial Leadership (Cognitive), Career Potential (Cognitive),
Combat Leadership (Non-Cognitive), Technical-Managerial Leadership
(Non-Cognitive), Career Potential (Non-Cognitive), and Career Intent.
On-the-job performance measures consisted of a special purpose Performance
Evaluation Form and a weighted average of Officer Efficiency Report (OER)
ratings.


*The views expressed in this paper are those of the author and do not
necessarily reflect the views of the Army Research Institute or the
Department of the Army.


PROCEDURE


Data Collection


All officers in the 13 Career Branches who attended the Officer
Basic Course in Fiscal Year 1974 were administered the Officer Evaluation
Battery. Peer ratings were obtained at the mid-point of the OBC and
again at the end of the course. Final course grades were obtained from
each OBC in either actual grades or in class standing within each OBC
class or both.

The Performance Evaluation Form has been described in detail
elsewhere (Gilbert, 1975; Gilbert and Grafton, 1976; Gilbert, Hooper, &
Hicks, 1977). Essentially this instrument was designed to yield a
measure of overall duty performance and rankings and ratings of potential
performance along a number of leadership dimensions. Five of these
leadership dimensions correspond to factors derived by Helme, Willemin, and
Grafton (1971) and the factors of consideration and initiation of
structure identified by Fleishman (1974) and Stogdill (1974). In addition,
the form required ratings along the two more global dimensions of combat
leadership and technical-managerial leadership identified by the Helme,
Willemin, and Grafton research. In Figure 1, the dimensions assessed
by the Performance Evaluation Form are shown with the corresponding
scale of the Performance Evaluation Form and the abbreviated title of
each scale. A seven-point scale adapted from Willemin (1965), shown in
Figure 2, was used for each rating. Raters were required to rank seven
of the scales in terms of the officer's potential for future performance
and then provide ratings in these areas. Three of the scales (duty
performance, combat leadership, and technical-managerial leadership)
required ratings only.

Ratings on the Performance Evaluation Form were obtained from four
raters as far as possible. Ratings were requested from the officer's
immediate supervisor, from a superior officer other than the officer's
immediate supervisor but not necessarily the OER indorsing official, and
from each of two close associates.


Data Preparation

The Officer Basic Course grades and class standings were equated by
ranking the grades of those officers for whom only class grades were
available within the OBC class of which each was a member. These rankings
were then converted to standard scores. Where rankings were available
they were converted to standard scores within the different OBC classes.
Scores were standardized with a mean of 100 and a standard deviation of
20.
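
The rescaling to a mean of 100 and a standard deviation of 20 can be sketched as a linear transformation of within-class values. The paper does not say how rankings were converted to standard scores, so the plain z-score transformation below, the function name, and the example values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def standard_scores(values, higher_is_better=True,
                    target_mean=100.0, target_sd=20.0):
    """Linearly rescale one class's values to the target mean and SD.

    Set higher_is_better=False for class standings, where rank 1 is the
    top of the class, so that a better standing gets a higher score.
    """
    centre = mean(values)
    spread = pstdev(values)  # population SD within the class (assumed)
    if spread == 0:
        raise ValueError("all values are identical; cannot standardize")
    sign = 1.0 if higher_is_better else -1.0
    return [target_mean + target_sd * sign * (v - centre) / spread
            for v in values]


# Hypothetical OBC class with course grades only.
grades = [70, 80, 90]
# Hypothetical OBC class with class standings only (1 = best).
standings = [1, 2, 3]

print([round(s, 1) for s in standard_scores(grades)])
print([round(s, 1) for s in standard_scores(standings,
                                            higher_is_better=False)])
```

Standardizing within each class, as the text describes, puts grades and standings from different classes on one common scale before they are pooled.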
Figure 2


OFFICER PERFORMANCE SCALE¹


Scale Value  Description


7. OUTSTANDING: Far above the requirements of the situation, suggesting
the highest kind of formal recognition through meritorious award or
decoration.

6. SUPERIOR: Markedly above the requirements of the situation,
suggesting formal recognition through a special (favorable) efficiency
report or letter of commendation.

5. ABOVE AVERAGE: Somewhat above the requirements of the situation,
suggesting informal recognition through specific favorable comment
in his regular efficiency report, and through informal appreciation
or commendation.

4. AVERAGE: Fully up to the requirements of the situation, suggesting
general appreciation (perhaps mostly unexpressed).

3. BELOW AVERAGE: Somewhat below the requirements of the situation,
though suggesting only the mildest kind of corrective action through
informal recommendation for improvement, or through change of duty
assignment within the organization.

2. MARGINAL: Markedly below the requirements of the situation,
suggesting formal corrective action through a special (unfavorable)
efficiency report, administrative admonition, letter of reprimand,
summary court, or transfer out of the organization.

1. UNSATISFACTORY: Far below the requirements of the situation,
suggesting the most drastic kind of formal corrective action
through reclassification, demotion, general court, or boarding
out of the Army.


¹Adapted from Willemin (1965)



Completed Performance Evaluation Forms were edited to ensure
compatibility of rankings and ratings. An officer's area
of greatest potential performance should have had a ranking of "1" and
his weakest area should have had a ranking of "7". The highest rating
should be given to his strongest area and a lower or equivalent rating
should be assigned to his next strongest area and so forth. Here it
should be pointed out that the rating system permitted the rater to
give the same rating on more than one scale of the form. Where
discrepancies were encountered between the ratings and rankings, the
ratings or the rankings were corrected.
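
The editing rule just described (ratings must not increase as the rank number increases, with ties in ratings allowed) can be sketched as a simple consistency check. The seven scale names follow Table 1 of this paper; the function name and the example rankings and ratings are hypothetical, and the sketch only flags discrepancies, since the text does not say how they were corrected.

```python
RANKED_SCALES = [
    "Tactical Knowledge", "Understanding Mission", "Making Decisions",
    "Defining Functional Roles", "Planning and Organizing",
    "Motivating Troops", "Logistical Knowledge",
]


def find_discrepancies(rankings, ratings):
    """Return (better_ranked, worse_ranked) pairs of neighbouring scales
    where the better-ranked scale received the lower rating."""
    n = len(rankings)
    if sorted(rankings.values()) != list(range(1, n + 1)):
        raise ValueError("rankings must use each of 1..n exactly once")
    ordered = sorted(rankings, key=rankings.get)  # strongest area first
    return [(better, worse)
            for better, worse in zip(ordered, ordered[1:])
            if ratings[better] < ratings[worse]]


# Hypothetical form: rank 1 = area of greatest potential.
rankings = dict(zip(RANKED_SCALES, [1, 2, 3, 4, 5, 6, 7]))
consistent = dict(zip(RANKED_SCALES, [7, 6, 6, 5, 5, 4, 3]))
inconsistent = dict(zip(RANKED_SCALES, [6, 7, 6, 5, 5, 4, 3]))

print(find_discrepancies(rankings, consistent))
print(find_discrepancies(rankings, inconsistent))
```

Checking neighbouring ranks is sufficient here, because ratings that never increase from one rank to the next cannot increase across any wider gap.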


RESULTS  AND  DISCUSSION 

Reliability estimates for each of the ten scales of the Performance
Evaluation Form are shown in Table 1. These estimates are based on cases
for which all four ratings were available in the 13 Career Branches.
Estimates were obtained by averaging the six possible correlations among
the four raters and adjusting the resulting average by the Spearman-Brown
Prophecy Formula.
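
As a sketch of that calculation: with k raters and a mean inter-rater correlation r, the Spearman-Brown formula gives the reliability of the pooled rating as k*r / (1 + (k - 1)*r). The six pairwise correlations below are hypothetical; only the formula and the four-rater, six-pair design come from the text.

```python
from itertools import combinations


def spearman_brown(mean_r, k):
    """Reliability of the average of k raters, given the mean
    correlation between pairs of single raters."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def pooled_reliability(pairwise_r, n_raters=4):
    """Average the pairwise correlations, then step up to n_raters."""
    expected_pairs = len(list(combinations(range(n_raters), 2)))
    if len(pairwise_r) != expected_pairs:
        raise ValueError(f"expected {expected_pairs} correlations")
    mean_r = sum(pairwise_r) / len(pairwise_r)
    return spearman_brown(mean_r, n_raters)


# HYPOTHETICAL correlations for the six pairs among four raters.
example_pairs = [0.35, 0.40, 0.33, 0.38, 0.36, 0.40]
print(round(pooled_reliability(example_pairs), 2))
```

Read in reverse, the formula shows that a four-rater reliability near .70 corresponds to an average correlation between single raters of somewhat under .40.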

The reliability estimates ranged from .70 for the Combat Leadership
Scale to .55 for the Logistical Knowledge Scale. These estimates
were reported by Gilbert (1977) and support the findings of Willemin
(1965).

The correlations between the ten predictor variables and each of the
ten scales of the Performance Evaluation Form are shown in Table 2 for the
entire sample. The multiple correlations between the ten predictor
variables and each scale are shown in the same table. Examination of this
table reveals that final course peer ratings yielded the highest
zero-order correlations with eight of the scales, the two exceptions being
the Technical-Managerial scale and the Tactical Knowledge scale. Final
course peer ratings were most predictive of overall duty performance,
combat leadership, and making decisions.

In Table 3, the correlations among the same set of variables
and corresponding multiple correlations are shown for the Combat Arms
Branches. These branches are Air Defense, Armor, Field Artillery, and
Infantry. In this analysis, final course peer ratings yielded zero-order
correlations with the criteria that were higher than or equal to those of
the other predictors in all but two instances, these being the
Technical-Managerial Scale and the Logistical Knowledge Scale.

The correlations between each set of predictors and each of the ten
criteria and corresponding multiple correlations are shown in Table 4 for
the branches other than the Combat Arms branches. Here, for all of the
criteria with the exception of the Technical-Managerial scale, the
zero-order correlations between final course peer ratings and the criteria
were higher than or equal to the zero-order correlations between each of
the other variables and each of the criteria.
Table 1


Reliability Estimates for Each
Scale of the Performance Evaluation Form


                                     Reliability
Scale                                 Estimate

Duty Performance                        .67
Combat Leadership                       .70
Technical-Managerial Leadership         .58
Tactical Knowledge                      .68
Understanding Mission                   .59
Making Decisions                        .66
Defining Functional Roles               .58
Planning and Organizing                 .57
Motivating Troops                       .60
Logistical Knowledge                    .55
[Tables 2, 3, and 4: the table bodies did not survive; only the caption
fragment and legend below remain.]

Correlation Between Each Predictor and Each Scale of the Performance
Evaluation Form and Corresponding Multiple Correlation for the
[remainder of caption missing]

Predictor abbreviations:
CLC  = Combat Lead. (Cog)            CPNC = Career Pot. (Non-Cog)
TMC  = Tech-Manag. Lead. (Cog)       CI   = Career Intent
CPC  = Career Pot. (Cog)             OBCG = OBC Course Grade
CLNC = Combat Lead. (Non-Cog)        PRM  = Mid-Course Peer Ratings
TMNC = Tech-Manag. Lead. (Non-Cog)   PRF  = Final Peer Ratings

 *Significant at the .05 level
**Significant at the .01 level


In general, the results of these analyses indicate that final course
peer ratings obtained in the Officer Basic Course are the best predictor
of duty performance when measured by the duty performance scale of the
Performance Evaluation Form for the total sample. The zero-order
correlation in this instance is .26 between final course peer ratings and
duty performance, while the next highest zero-order correlation is .19
between the same criterion and OBC final grades. In the Combat Arms
Branches there is little difference between the predictive value of peer
ratings and OBC final grades where overall duty performance is concerned.
However, for branches other than the Combat Arms the correlation between
final course peer ratings and overall performance is .30 while that of
grades and the criterion is .16.

The last analyses involved the relationships between the predictors
and the weighted average Officer Efficiency Report (OER) ratings. The
results of these analyses are shown in Table 5. For the total sample,
final course peer ratings yielded a correlation of .21 with OER ratings,
while the final grades obtained in the Officer Basic Course (OBC) yielded
a correlation of .20. In the Combat Arms Branches OBC final course
grades yielded a slightly higher zero-order correlation with OER ratings,
while in the branches other than Combat Arms the reverse obtained.

The results of this research are similar to those reported elsewhere
on the utility of associate ratings or peer ratings in predicting
subsequent performance (Heine, 1965; Gilbert, 1975; Gilbert and Downey,
1977). Further research will seek to explore how the predictive utility of
peer ratings may be enhanced. One possibility in this regard is to divide
the sample into individual career branches, since some differences in
predictive power between Combat Arms branches and the other branches were
observed in this research. Another possibility along this line is to
divide the sample according to the similarities of specialties in which
the officers are engaged.
Table 5


Correlation Between Each Predictor and Weighted Average
Officer Efficiency Ratings and Corresponding Multiple Correlations
for the Combat Arms, Branches
Other Than Combat Arms, and for the Total Sample


                                                  Branches Other
                                   Combat Arms    Than Combat Arms    Total
Predictor                          (N = 2,486)    (N = 2,120)      (N = 4,506)

Combat Leadership (Cognitive)         .06            .05              .06
Technical-Managerial
  Leadership (Cognitive)              .02           -.01              .01
Career Potential (Cognitive)          .01            .04              .02
Combat Leadership (Non-Cognitive)     .08*           .10**            .09**
Technical-Managerial
  Leadership (Non-Cognitive)          .10**          .09**            .10**
Career Potential (Non-Cognitive)      .02            .01              .02
Career Intent                         .02            .06              .04
OBC Grades                            .25**          .15**            .20**
Mid-Course Peer Ratings               .12**          .11**            .11**
Final Peer Ratings                    .22**          .21**            .21**

Multiple Correlations                 .30**          .24**            .26**

 *Significant at the .05 level.
**Significant at the .01 level.
USING AN ASSESSMENT CENTER TO PREDICT FIELD LEADERSHIP
PERFORMANCE OF ARMY OFFICERS AND NCOS

Frederick N. Dyer
Richard E. Hilligoss
Army Research Institute Field Unit
Fort Benning, Georgia

INTRODUCTION

The assessment center concept involves the immersion of
individuals into situations which simulate those they would face if
they were selected for entry or promotion. It has been widely used
in industry and business to select personnel for high level
positions. In 1973-1974 the U.S. Army Infantry School (USAIS)
Assessment Center (ACTR) assessed students from the Infantry
Officer Advanced Course (IOAC), the Infantry Officer Basic Course
(IOBC) and the Advanced NCO Educational System (ANCOES) to
determine the feasibility of the assessment center concept as a
leadership development and leadership prediction technique. It
also assessed students from the Branch Immaterial Officer
Candidate Course (BIOCC) to determine the feasibility of the
assessment center concept as a selection device. The purpose of
the present paper is to discuss the effectiveness of the ACTR for
predicting field leadership performance.

ACTR DESCRIPTION

Table 1 presents a summary of assessee characteristics and
group sizes. Assessees reported to Fort Benning one week before
their scheduled USAIS course to participate in the assessment
center. Day I was sign in, room assignment, meal arrangements,
etc.; Day II, Day III and 1/2 of Day IV were given to the
assessment process, providing 2 1/2 days of assessment. Two pilot
sessions were completed in June-July 1973, with the first regular
session beginning 11 July 1973.

The assessor pool consisted of six Majors, seven Captains,
two Lieutenants, three Master Sergeants, two Sergeants First
Class, and one Staff Sergeant. The assessors were selected by DA
using the following criteria: each man must be in one of the
combat arms; each Captain and above must have had command
experience; each Major, Captain, and Sergeant must have served in
combat; and Officers must have had an advanced degree in one of
the behavioral sciences.
Table 1

ASSESSEE GROUP CHARACTERISTICS AND SIZES


Descriptor                    IOBC         IOAC         BIOCC(OCS)   ANCOES

Number Assessed               90           [illegible]  [illegible]  87
Number with complete
  6-month ratings             [illegible]  36           [illegible]  58
Pay Grade                     O-1          O-3          [illegible]  [illegible]
Average Age                   22.6         28.8         [illegible]  55.3
Average Years of
  Active Duty                 0.5          5.7          [illegible]  12.9
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LEADERSHIP  DIMENSIONS 

The staff of the assessment center and Army Research
Institute and HumRRO scientists identified lists of behavioral
dimensions characteristic of effective leadership. The basis of
this selection was the Ohio State Leadership Study, the Army War
College report, Leadership for the 1970s, and other prominent
studies on leadership.

The ten dimensions of leader behavior judged to be
appropriate for the assigned mission and which it was believed
could be evaluated using the assessment center concept were:

Adaptability
Administrative Skills
Communication Skills
Decision Making
Forcefulness
Mental Ability
Motivation
Effectiveness in an Organizational Leadership Role
Social Skills
Supervisory Skills

ASSESSMENT  CENTER  EXERCISES 

The staff then began the selection and construction of
exercises and questionnaires. In evaluating possible exercises
and exercise concepts, a basic factor of consideration was that
the exercises would place the assessees in uniquely different
situations while simultaneously providing multiple opportunities
for the evaluation of each dimension. Exercises were ultimately
selected based upon their situational diversity, military
relevance and apparent potential for eliciting behaviors related
to the designated dimensions. The following battery of exercises
was selected:

Entry Interview: A background interview to elicit information
related to motivation, experience and the assessee's
self-knowledge of his strengths and weaknesses (90').

Appraisal Interview: An applied exercise in which each
assessee interviewed two others to select one for a position
within a battalion. This interview elicited behaviors related to
communication skills, social interaction and organization of
thought (100').

Leaderless Group Discussion: This exercise was a combined
individual and group task in which 6 IOAC assessees were assigned
a mission to distribute year-end funds among the represented
directorates while each attempted to acquire a maximum amount for
his own directorate. IOBC, BIOCC, and ANCOES assessees were
assigned a mission to get a soldier from their unit selected as
the Brigade Soldier of the Month and to provide a rank order of
merit list of the available candidates. This exercise elicited
behaviors associated with forcefulness, persuasiveness,
organizational ability and group interaction (120').

In-Basket Exercise (Three versions: IOAC - assessee was placed
in the role of a battalion commander; IOBC/BIOCC - assessee was
placed in the role of a company commander; ANCOES - assessee was
placed in the role of a 1st Sergeant). An in-basket containing
many items typical of the appropriate position was presented to
the assessee, who had 3 hours to address each item in the
in-basket. This exercise elicited behaviors relating to problem
solving, decision making, work organization and leadership. It
was followed by an interview to discuss reasons for action taken
and the relationship perceived to exist among some of the actions
(180').

War Game (IOAC assessees only): This was an assigned-role
rotating leader exercise conducted in two 2-hour sessions. Teams
of 6 players engaged in cost effectiveness analysis in a military
force planning environment. Total costs, R&D, intelligence
acquisition, and balanced offensive/defensive forces were all
considered under limited budget and time constraints. This
exercise elicited organizational and leadership behavior (240').

Radio Simulate (Three versions: IOAC assessees were placed in a
company commander role; IOBC/BIOCC assessees were placed in a
platoon leader role during a civilian emergency situation to
insure that lack of military experience did not preclude them from
participation in the exercises; ANCOES assessees were placed in
the role of acting platoon leaders). It was a 5-hour exercise
using radios as the only means of communication. It elicited
organizational and leadership behaviors (300').

Assigned Leader Group Exercise (IOBC, BIOCC, ANCOES): This was a
5-hour rotating designated leader exercise involving a team of 6
assessees. There were 6 lanes with a different obstacle provided
for each lane. A separate mission and choice of materials was
also available. It elicited emergent leadership, planning and
organizational behaviors (300').

Management Exercise ("Conglomerate"): This was a two hour
exercise divided into two planning and two trading periods. The
18-man assessment group was organized into three 6-man groups who
competed against each other. This exercise elicited behaviors
relating to emergent leadership, aggressiveness and social
interaction (120').

Writing Exercise: This was an exercise designed to measure
accuracy of information provided, grammar, spelling, and
completeness. The IOAC assessees responded to a Staff Action
Paper and the other assessment groups to a discharge action (60').

PSYCHOMETRIC  TESTS  AND  SELF-DESCRIPTION  INSTRUMENTS 

A survey of tests in general was made, revealing many
possibilities for adoption into the assessment program. The
primary criterion for selecting specific tests was relevance of
the variables to be tested to the dimensions of leadership:
administrative skills, communication skills, supervisory skills,
forcefulness, adaptability, decision making, and mental ability.

The secondary criteria used in selecting tests were:
non-offensive test items, suitability in content and format for use
with mature adults, adequacy of normative data and theoretical
discussions, recency of publication or revision and efficiency in
test administration.

Both cognitive and non-cognitive tests were selected
specifically to (1) allow for the comparison of an individual
score with normative data and (2) verify the results of other
assessment measurements. Group tests were selected in order to
minimize the number of assessors and the amount of time required
for each assessment. The psychometric tests and self-descriptive
instruments selected are listed below.

1. Leadership Opinion Questionnaire
2. Watson-Glaser Critical Thinking Appraisal
3. Nelson-Denny Reading Test
4. Henmon-Nelson Tests of Mental Ability
5. Leadership Q-Sort Test
6. Social Insight Test (Chapin)
7. Work Environment Preference Schedule (Gordon)
8. Strong Vocational Interest Blank
9. Edwards Personal Preference Schedule
10. Person Description Blank


Questionnaires to obtain specific background information
about the assessee, and to solicit the assessee's opinion of his
assessment experience, were also developed. The purpose of these
questionnaires was to assist in the overall research effort and to
collect suggestions for improving Assessment Center techniques and
administration.

FEEDBACK TO ASSESSEES

To provide an impetus for individual self-development of
leadership attributes, IOAC and ANCOES students were provided
post-assessment counseling feedback. The purpose of this session
was to enhance professional development through identifying the
counselee's leadership strengths and weaknesses as observed at the
Assessment Center. The counselor discussed with the counselee
each behavioral area assessed, described the observations noted by
assessors, and guided him towards available and effective
materials that would be beneficial to him in developing a
self-improvement program. The counselor's approach was essentially
non-directive, as any actions for self-improvement were left
entirely to the counselee.

FIELD  LEADERSHIP  PERFORMANCE  RATINGS 

The leadership criterion used to validate the ACTR measures
consisted of ratings of ten leadership dimensions by two
superiors, two peers and two subordinates of the assessee. These
were made six months following the completion of the assessee's
USAIS course by personnel in his new unit. The same ratings were
again obtained 18 months following completion of school, although
fewer questionnaires were returned at this later period. Where
ratings were obtained at both periods, there was only a 10%
overlap in raters from the first period to the second.

The ten leadership dimensions were Decision Making,
Administrative Skills, Interpersonal Competence (Social Skills),
Communication Skills, Supervisory Skills, Organizational Role
Skills, Technical and Tactical Competence, Leader Motivation,
Leader Adaptability, and Leader Forcefulness. For each dimension
five statements describing particular behaviors were rated, making
a total of 50 items on the Leadership Performance Rating Form
(LPRF).

Approximately one-half of the questionnaires were returned.
Complete rating data was obtained on 159 of the original 408
assessees at six months and complete data was obtained on 108
assessees at six and 18 months.

The average rating for all 50 items per questionnaire and all
six questionnaires was calculated for six months and for 18
months. The correlations between these two averages ranged from
.54 for the IOBC assessees, through .68 for IOAC assessees, to .75
for the ANCOES assessees. Only 15 BIOCC assessees had complete
rating sets for 6 and 18 months and the negative correlation
between six and 18 month averages for this group (-.35) may have
been a spurious result. The six-month/18-month correlation may be
thought of as a test/retest reliability. These correlations are
surprisingly high since many factors could operate to change
leadership over the 12-month period between ratings and because of
the relatively short time for observation of leadership prior to
the first ratings (six months). Correlations between rater types
(superior, peer, subordinate) were also generally significant and
positive for each rating period (for IOBC, IOAC, and ANCOES
assessees).
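
The six-month/18-month correlation treated above as a test/retest reliability is an ordinary Pearson correlation between each assessee's two overall averages. The following is a minimal sketch in Python of that computation. The function name and the rating values are hypothetical and purely illustrative; they are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations and the two sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical overall average ratings for five assessees (illustrative only).
six_month = [3.1, 3.8, 2.9, 4.2, 3.5]
eighteen_month = [3.0, 3.9, 3.2, 4.0, 3.3]

retest_r = pearson_r(six_month, eighteen_month)
print(f"six-month/18-month r = {retest_r:.2f}")
```

The sketch assumes both lists vary; a group in which every assessee had the same average would give a zero denominator.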

Although these correlations indicate the overall average
rating at a rating period was highly reliable, the questionnaire
failed to discriminate among the ten dimensions that presumably
were represented in the fifty items. A factor analysis indicated
only one significant factor, which accounted for 74% of the common
variance. It is not clear whether the failure to discriminate
among leadership dimensions reflected on ratee performance or
whether the different leadership dimensions are as interdependent
as these high correlations indicate.

Since much more data was available for the six-month rating
period (with almost no 18-month data from the BIOCC assessees) and
since a high correlation existed where such data were available,
the average rating for all 300 questions (six raters x 50
questions) at the six-month rating period was used as the field
leadership criterion to validate the ACTR measures.
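
The criterion described above is a single mean over six raters x 50 items for each assessee. A minimal Python sketch of that averaging follows. The function name, the 1-to-5 rating scale, and the example ratings are assumptions made for illustration; the paper does not state the scale used on the LPRF.

```python
def field_criterion(ratings_by_rater):
    """Mean of all item ratings across all raters for one assessee."""
    all_items = [score for rater in ratings_by_rater for score in rater]
    if not all_items:
        raise ValueError("no ratings supplied")
    return sum(all_items) / len(all_items)


# Hypothetical assessee: two superiors, two peers, two subordinates,
# each rating 50 items on an assumed 1-to-5 scale (illustrative values).
raters = [[3] * 50, [4] * 50, [3] * 50, [5] * 50, [4] * 50, [5] * 50]

criterion = field_criterion(raters)
print(f"field leadership criterion = {criterion:.2f}")
```

Pooling all 300 ratings weights every rater and item equally, which matches the single-factor result reported for the questionnaire.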

RESULTS 

The scores obtained from the ACTR fall into the following six
classes:

1. Assessor ratings of assessee performance during
individual and group formal exercises such as the In-Basket,

2. Peer rankings of assessees in those formal exercises
where a group of assessees participated together such as the
Assigned Leader Group Exercise,

3. Self-rankings by the assessee of his performance relative
to other group members in these group exercises,

4. Leadership dimension ratings made by an assessor during
the Entry Interview with the assessee,

5. Assessee performance on paper and pencil performance
tests, and

6. Assessee self-descriptions on questionnaires and other
instruments such as the Edwards Personal Preference Schedule.

The results will be discussed for each of the above classes
of score and, following this, the classes of ACTR scores
themselves will be discussed and compared on their effectiveness
for prediction of the field leadership ratings criterion.
Proportions of successful predictors will be compared among
classes, as will the amount of time required by assessors and
assessees to obtain each successful measure. The end result will
be an ordering of the different classes of ACTR measure on their
utility for predicting the criterion.

1. ASSESSOR RATINGS OF ASSESSEE PERFORMANCE DURING FORMAL
EXERCISES

Leaderless Group Discussion

Assessor ratings for this exercise provided good predictors
of the field leadership criterion for the IOBC assessee group. In
particular, a rating of "amount of negative social behavior shown"
was correlated (r = -.56, p < .01) with the criterion, indicating
that those assessees who showed more negative social behavior were
more likely to be rated high on field leadership. Similarly,
"social concern" was related to the criterion, with IOBC assessees
who showed less social concern being more apt to be rated high on
field leadership (r = -.3Y, p < .01). One other rated dimension
that was significantly related to the criterion for this group was
"speaking ability". IOBC assessees who were rated high on this
dimension were more apt to be rated high on field leadership
(r = .28, p < .05).

For BIOCC assessees, "social concern" was significantly
related to the criterion (r = .31, p < .05) but, contrary to IOBC,
high social concern was related to good ratings on the criterion.
"Amount of negative social behavior" showed a similar reversed
relation to the criterion (compared to IOBC), although the
correlation was not significant (r = .24, p = .06).

For ANCOES assessees, the Leaderless Group Discussion
produced a single significant relation with the criterion. The
dimension "conveys information" was correlated negatively
(r = -.32, p < .05), indicating that persons rated lower on this
communication skill dimension were more apt to be rated high on
the criterion. As will be shown throughout this section, poor
performance for NCOs on the ACTR exercises was frequently related
to higher ratings on the criterion.

Assessor ratings on the Leaderless Group Discussion failed to
predict the criterion for the IOAC assessee group.
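
The significance levels attached to the correlations in this section depend on the sample size as well as on r. The paper does not state which test was used; the sketch below assumes the conventional t test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The function name and the group size are hypothetical.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson correlation r."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


# Hypothetical group size; the r value echoes one reported above.
n_assessees = 50
t_value = t_from_r(0.28, n_assessees)
print(f"t = {t_value:.2f} with {n_assessees - 2} degrees of freedom")
```

The resulting t would then be compared with the critical value of the t distribution for the chosen significance level; the same r can be significant in a larger group and not in a smaller one.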

Conglomerate

Only two of the assessor ratings for this exercise showed
significant relationships with the criterion. For the IOBC
assessees, ratings of "energy and vigor" were negatively
correlated (r = -.26, p < .05), indicating that low energy and
vigor were more apt to be related to high field leadership
ratings. For the BIOCC assessees, the "receptivity" rating showed
a positive correlation with the criterion (r = .36, p < .01).
Assessees who were rated higher on "listening to and considering
ideas of others" were more apt to receive high field leadership
ratings.

Assessor ratings on the Conglomerate Exercise failed to
predict the criterion for the ANCOES and IOAC groups.

Radio Simulate

Assessor ratings on the Radio Simulate exercise were almost
completely unrelated to the field leadership criterion. Only for
the ANCOES assessees were significant relations found, for ratings
of "communication skills" (r = -.27, p < .05), "adaptability"
(r = -.28, p < .05) and "organizational identification" (p < .01).
In each of these cases, poor NCO performance on the exercise was
related to high criterion performance.

In-Basket

Assessor ratings on this exercise showed significant
relations to the criterion for all groups but the IOBC assessees.
For IOAC captains, the field leadership criterion was positively
related to good assessor ratings on "decision making" (r = .33,
p < .05) and "use of available information" (r = .36, p < .05).
For BIOCC assessees, high criterion ratings were related to good
performance on "planning and organization" (r = .27, p < .05) and
"task orientation" (r = .35, p < .05).

All significant relationships between In-Basket assessor
ratings and ANCOES field leadership ratings were negative. Good
criterion ratings were related to poor "directing ability"
(r = -.27, p < .05), poor "task orientation" (r = -.37, p < .01)
and poor "sensitivity" (r = -.27, p < .05).


Appraisal Interview

No assessor rating was significantly related to the criterion
for the IOBC, IOAC and BIOCC assessee groups for this exercise.

For the ANCOES assessee group only one dimension, "ability to
organize", was related (r = -.33, p < .05). The negative
correlation indicates that poor "ability to organize" on the
exercise was related to good field leadership ratings.

Writing Exercise

Assessor ratings on "accuracy of written information" were
significantly related to the criterion for both the IOBC and IOAC
groups (r = -.27 and r = -.29, respectively, p < .05 for both).
The negative relationship indicates that poorer writing accuracy
was related to better field leadership ratings. The other
significant relationship for this exercise was "spelling", which
for the ANCOES assessees was related positively to the criterion
(r = .28, p < .05).

Assessor ratings on the Writing Exercise failed to predict
the criterion for the BIOCC group.

Assigned Leader Group Exercise

All assessee groups except the IOAC captains completed this
exercise. This exercise was successful in predicting the
criterion for the ANCOES group. High assessor ratings on two
dimensions were associated with high field leadership ratings.
These were "emergent leadership" (r = .29, p < .05) and "group
facilitation" (r = .29, p < .05). Interestingly, these were the
two dimensions on the exercises that were classed as "follower
behaviors". The other significant relationship indicated that low
assessor ratings on "flexibility" were associated with high scores
on the criterion (r = -.30, p < .05).

The ALGE assessor ratings provided no significant
correlations with the criterion for the remaining IOBC and BIOCC
assessee groups.

Leader Game

Only the IOAC captains participated in this exercise (it took
the place of the ALGE for this group) and one of the
assessor-rated dimensions, "flexibility", was successful in
predicting the criterion. Assessees with high flexibility ratings
were apt to be rated high on field leadership (r = .36, p < .02).
Among the nonsignificant assessor ratings, dimensions of
"leadership", "planning", and "organization", which would be
expected to have even stronger relations to a leadership
criterion, did not even approach significance.

2.  PEER  RANKINGS  ON  GROUP  EXERCISES 

Leaderless Group Discussion

The six group members who participated in this exercise
ranked all six members on a number of different dimensions at the
end of the exercise. No significant predictors of the criterion
were found for any of the dimensions on which peer rankings were
made.

Conglomerate Exercise

Similar rankings were obtained from group members in this
exercise with similar results, i.e., no significant relationships
with the criterion for any assessee group.

Assigned Leader Group Exercise

More predictive validity was found for peer rankings in this
exercise. In fact three of the four dimensions provided
significant criterion predictors for the ANCOES assessee group.
These were "ability to lead" (r = .29, p < .05), "quality of
leader support" (r = .27, p < .05), and "generating group morale"
(r = .33, p < .05). These positive correlations indicate that
high-ranked individuals on the exercise tended to receive the high
field leadership ratings. The only other significant correlation
for this exercise appeared for the BIOCC assessee group for a
ranking of "how much you would like to associate with them
socially" (r = .30, p < .05). Persons preferred for socialization
were more apt to be rated high on the criterion.

Leader Game

This exercise did not produce any significant peer ranking
correlations with the criterion for the IOAC assessees who
participated in it.

3.  SELF  RANKINGS  ON  GROUP  EXERCISES 

Leaderless Group Discussion

The assessee included himself in the group rankings for this
exercise and his self-ranking was tested also as a predictor of
the criterion. Only one of these scores was found to predict the
criterion. This was the self-ranking on "idea quality" (r = .32,
p < .05) for the ANCOES assessees. Persons who ranked themselves
higher on this dimension were more apt to receive high field
leadership ratings.

Conglomerate 

Three self rankings were significantly associated with the
criterion on this exercise for the ANCOES assessees. These were
"popularity" (r = .29, p < .05), "energetic support of team
effort" (r = .34, p < .05), and "causing conflict within the
group" (r = .29, p < .05). High "popularity", high "energetic
support of team effort" and low "amount of conflict" were related
to high ratings of field leadership. For the IOAC group,
self-rankings of "idea quality" were related positively to the
criterion (r = .31, p < .05). IOBC and BIOCC assessees did not
produce significant self-ranking predictors for this exercise.

Assigned Leader Group Exercise

The ANCOES assessee group produced the only significant
self-ranking predictors for this exercise. These were for
dimensions of "ability to lead" (r = .32, p < .05) and "generating
group morale" (r = .30, p < .05). The positive correlations
indicate high self-rankings were related to good ratings on the
field leadership criterion. IOBC assessees did not produce
significant self-ranking predictors and the IOAC did not
participate in this exercise.

Leader Game

As with peer rankings, self rankings produced no significant
correlations with the criterion for the IOAC assessees, who were
the only participants in this exercise.

4. ENTRY INTERVIEW PERFORMANCE EVALUATION

Six of the 14 scores of the Entry Interview significantly
predicted the field leadership ratings of the BIOCC assessee
group. These were "overall impression" (r = .42, p < .01),
"interest in self-development" (r = .28, p < .05), "effectiveness
in conveying information" (r = .35, p < .05), "derives
satisfaction from work accomplishments" (r = .31, p < .05),
"fluent and articulate" (r = .29, p < .05), and "how well he
expresses his opinions" (r = .29, p < .05). These positive
correlations indicate that good Entry Interview ratings were
related to good field leadership criterion ratings.

The ANCOES assessees who were rated high on "animation and
enthusiasm" were much more apt to receive high criterion ratings
than their lower-rated colleagues (r = .45, p < .01). For this
group "interest in self-development" was inversely related to the
field leadership ratings (r = -.29, p < .05). The only other
significant predictor from the Entry Interview was for the IOBC
group. As for the ANCOES group, "interest in self-development"
was negatively correlated with field ratings of leadership
(r = -.27, p < .05).

5.  PENCIL  AND  PAPER  PERFORMANCE  TESTS 

The four tests that fall into this category are the
Henmon-Nelson Tests of Mental Ability, the Watson-Glaser Critical
Thinking Appraisal, the Nelson-Denny Reading Test, and the Social
Insight Test. Only for the ANCOES assessee group did these
measures successfully predict the field leadership ratings
criterion. However, it is questionable to use the term
"successfully" since poor performance on the Henmon-Nelson
Quantitative (r = -.30, p < .05), Henmon-Nelson Verbal (r = -.41,
p < .01), Henmon-Nelson Total Score (r = -.40, p < .01),
Nelson-Denny Vocabulary (r = -.36, p < .05), Nelson-Denny
Comprehension (r = -.32, p < .05) and Nelson-Denny Total
(r = -.37, p < .0*) were related to good ratings on the field
leadership criterion. The Watson-Glaser Critical Thinking
Appraisal and the Social Insight Test showed no significant
correlations with the criterion for any of the assessee groups.

6.  SELF-DESCRIPTION  INSTRUMENTS 

Edwards Personal Preference Schedule (EPPS)

One of the highest correlations obtained with the criterion
was from this instrument. IOAC assessees with a high "Need for
Order" tended to be rated higher on the field ratings of
leadership (r = .52, p < .001). In addition, the IOAC assessees
showed an inverse relationship between "Need for Succorance" (to
have others provide help when in trouble, to seek encouragement
from others, etc.) and the criterion (r = -.?5, p < .05).

The ANCOES assessee group also showed a number of significant
correlations between EPPS measures and the criterion. "Need for
Exhibition" was inversely related to the criterion (r = -.31,
p < .05), and "Need for Abasement" was related positively
(r = .28, p < .05). Finally, scores on the "Consistency" variable
were related to the criterion: positively for the IOBC assessees
(r = .30, p < .05) and negatively for the ANCOES group (r = -.34,
p < .05). No EPPS measures were significantly related to the
criterion performance of the BIOCC assessees.

Work Environment Preference Schedule (WEPS)

High scores on this measure "typify individuals who accept
authority, who prefer to have specific rules and guidelines to
follow, who prefer impersonalized work relationships, and who seek
the security of organizational and in-group identification." Two
of the assessee groups showed significant correlations of their
scores on this measure with their criterion field leadership
ratings. IOBC assessees who were lower on the WEPS were more
likely to receive high criterion ratings (r = -.25, p < .05) and
IOAC assessees who were higher on the WEPS were more likely to
receive high criterion ratings (r = .32, p < .05). The BIOCC and
ANCOES groups did not have significant correlations with the
criterion on this measure.

Leadership Opinion Questionnaire (LOQ)

ANCOES assessees scoring high on "Consideration" on the LOQ
were more apt to be rated high on the criterion (r = .36,
p < .05). IOBC assessees who were high on "Structure" were more
apt to be rated high on the criterion (r = .25, p < .05). No
other LOQ scores were significant for these or for the other
assessee groups.

Leadership Q-Sort (LQS)

IOBC assessees showed a fairly strong relationship of
"Decision Making" to the criterion, with the persons scoring low
on this dimension being more apt to receive high leadership
ratings (r = -.39, p < .01). "Teaching and Communication" scores,
on the other hand, were positively related to the criterion for
the IOBC group (r = .27, p < .05). High scores on "Mental Health"
were related to high criterion ratings for the ANCOES assessees
(r = .33, p < .05) while low scores on "Personal Integrity" were
related to high criterion ratings for this group (r = -.30,
p < .05).

IOAC assessees showed an inverse relation between
"Consideration" scores and the criterion (r = -.36, p < .05).
BIOCC assessees showed no significant relationship of LQS
measures to the criterion.

Person Description Blank

Fifty pairs of adjectives were presented to each assessee
(e.g. WARY: 1 2 3 4 5 6 7 :GULLIBLE) with instructions to rate
himself by circling the number that best described his position
between these polar adjectives. Twenty-six of these fifty pairs
produced significant correlations with the criterion for at least
one of the assessee groups. The pairs of adjectives and their
correlations with the criterion for each assessee group are
presented in Table 2. Positive correlations indicate that persons
who rated themselves higher than average on the rightmost
adjective were more apt to be rated high on field leadership.
Negative correlations indicate that persons who rated themselves
higher than average on the leftmost adjective were more apt to be
rated high on field leadership. A negative correlation does not
necessarily mean that people were closer to the "1" end of the
scale than to the "7" end of the scale. It only indicates that
persons who were on the "1" side of the overall average for that
item were more apt to be rated high on the criterion.

COMPARISON OF DIFFERENT CLASSES OF ACTR SCORES

Table 3 presents summary data for all assessee groups for the
six classes of ACTR scores. It can be seen that the number of
scores per assessee (Column 1) varied from 9 for the Pencil and
Paper Performance Tests to 75 for the Self-Description
Instruments. The assessor time per score (Column 4) showed a very
wide variation, from 10.9 minutes per score for Assessor Ratings
on Formal Exercises to less than one minute per score for the
Self-Description Instruments. The latter small time per score
reflects the assessor time savings that resulted from presenting
the Self-Description Instruments in a group (six assessees)
setting. The zero "assessor times per score" that appear for Peer
Rankings and Self Rankings reflect the fact that these scores were
provided by the assessees and did not require any additional time
of assessors beyond that required for the assessor ratings on
these exercises. The "assessee time per score" (Column 6) is
prorated over Assessor Ratings, Peer Rankings and Self Rankings.
Thus only a single figure is shown for this column for these three
categories. It can be seen that assessee time per score is also
long for the Formal ACTR Exercises. Assessee time per score is
longest for the Pencil and Paper Performance Tests and shortest
for the Self-Description Instruments.

A successful predictor is defined in this report as one which
has a correlation with the criterion that is significant at the
.05 level. In Column 2 of Table 3 the average number of
successful predictors per assessee is given and Column 3 shows the
percentage that this is of the total number of scores for the
assessee. Five percent would be expected by chance due to the .05
significance level. This figure ranges from a high of 16.7% for
the Pencil and Paper Performance Tests to near chance levels
(6.7%) for the Peer Rankings. The high figure for the Pencil and
Paper Performance Tests is somewhat misleading since all of the
significant predictors were for the ANCOES group and all indicated
poor pencil and paper test performance to be related to good field
ratings (see below). Perhaps the most interesting data is in
Column 5 where the assessor time per successful predictor for each
class of ACTR score is shown. This ranges from 2 minutes per
successful predictor for the Self-Description Instruments to
nearly two hours per such predictor for the Assessor Ratings of
Formal Exercises.
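
The arithmetic behind these columns can be sketched briefly. The
snippet below is only an illustration: it assumes that the time per
successful predictor equals the time per score divided by the
proportion of scores that are successful, and the function names and
the 10% success rate are hypothetical rather than values read from
Table 3.

```python
ALPHA = 0.05  # significance level that defines a "successful predictor"


def expected_by_chance(n_scores, alpha=ALPHA):
    """Number of scores expected to reach significance by chance alone."""
    return n_scores * alpha


def time_per_successful_predictor(minutes_per_score, proportion_successful):
    """Assessor minutes per successful predictor (assumed relation)."""
    return minutes_per_score / proportion_successful


# 75 Self-Description scores per assessee is taken from the text.
print(expected_by_chance(75))

# 10.9 minutes per score is from the text; the 10% rate is hypothetical.
print(round(time_per_successful_predictor(10.9, 0.10)))
```

Under this assumed relation, a class of scores with a long time per
score and a success rate near chance becomes very costly per
successful predictor, which is the pattern the paragraph describes.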

The assessor ratings of formal exercises represent the most
typical ACTR data and their collection is the raison d'etre of an
assessment center. The poor predictions from these rating scores
compared to interviews and to questionnaires are thus especially
disappointing for ACTR proponents. The poor performance is not a
result of low rating reliability. Checks of rater reliability on
the exercises where more than one assessor rated the same assessee
indicated that reliability of the ratings was surprisingly good.
Spearman-Brown calculations indicate the three-rater sums for LCD,
ALOE, CONG and LGAM to have reliabilities in the 70s and 80s.
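
For readers unfamiliar with the Spearman-Brown calculation, a minimal
sketch follows. It uses the standard prophecy formula for the
reliability of a sum of k parallel ratings; the single-rater value of
.50 is a hypothetical input, not a figure from this study.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Reliability of the sum of n_raters parallel ratings."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Hypothetical single-rater reliability of .50, summed over three raters.
print(spearman_brown(0.50, 3))
```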

The high reliability of the criterion field leadership
ratings was described earlier. Since both criterion and assessor
ratings are reliable, the failure of the assessor ratings to
provide more than a few significant correlations with the
criterion must reflect some failure of the ACTR exercises to
elicit and/or measure the same behaviors that peers, superiors,
and subordinates in field units classify as "leadership".

Tables 4, 5, 6 and 7 provide the data of Table 3 with a
separate breakdown by the different assessee groups. It can be
seen that the ANCOES scores (Table 7) provide much better
prediction of the criterion than the ACTR scores of any of the
other assessee groups. However, a sizable portion of the
significant ANCOES criterion predictors represent a troublesome
inverse relation between ACTR performance and the criterion. One
normally would not intentionally set up an ACTR with the intent of
selecting for promotion or employment only those persons who do
badly on the ACTR tasks. These inverse relationships between
predictor and criterion reflect a failure of the ACTR exercises,
the unsuitability of the criterion, or both, at least for the
ANCOES group.

Another result that is apparent from Tables 4, 5, 6 and 7 is
that different assessee groups often have different patterns of
success for the different classes of ACTR scores. For example,
the Entry Interview does an excellent job for the BIOCC group (36%
successful predictors) but it does little predicting for any other
group. For IOAC assessees, the Self-Description Instruments do a
good job of predicting the criterion but the other classes of
score have little predictive validity.

Table 8 represents a breakdown of the data in Table 3 by
separate exercise. The most effective single measure by almost
all criteria is the Person Description Blank. This instrument
required less than ten minutes to administer but provides much
more effective criterion prediction than exercises such as the
Radio Simulate which required five hours of assessee time, and
even more assessor time. However, it can be argued that self-
descriptions would be much less effective in a setting where
deliberately falsified self-descriptions might occur. False self-
descriptions would have been at a minimum in the USAIS ACTR since
the assessees were assured that the data would not affect their
careers.




[Table: RESULTS FOR SIX DIFFERENT CLASSES OF ACTR SCORES: IOBC ASSESSEES]

[Table 8 (cont'd): RESULTS FOR SEPARATE ACTR EXERCISES FOR ALL ASSESSEE GROUPS]

DISCUSSION


Two perspectives exist for discussion of these results. One
is in terms of the specific characteristics as measured in the
ACTR which predict field leadership ratings of the different
assessee groups. The other perspective for viewing these results
is in terms of the general question of what parts of the ACTR were
effective in assessment of leadership.

CHARACTERISTICS  OF  SPECIFIC  ASSESSEE  GROUPS 

The young lieutenant who had recently been commissioned and
who, following his Infantry course, was rated high on leadership by
peers, subordinates and superiors, judged himself to be more wary,
competitive, soothing and leading. His decision-making skills
were rated lower by himself and by trained assessors. Ironically,
he was judged to be somewhat lower on self-development than the
lieutenant who was rated more poorly on field leadership.

The captain who was about to enter the Advanced Infantry
Course and who later received high ratings on the field leadership
criterion was apt to be high on his need for order and more apt to
prefer a structured work environment. He performed well on in-
basket exercises and viewed himself as more hard-working, wary,
interesting, tough, ambitious, active, secretive, and stable.

The enlisted man about to enter Officer Candidate School and
who, following his OCS training and Branch leadership course, was
rated high on field leadership, was more apt than his low-rated
colleague to make a good impression and to be fluent, creative and
task-oriented. He viewed himself as more creative and persistent,
yet somewhat less dominating and less sincere than his colleague
who fared less well on field leadership ratings.

The NCO about to enter the advanced NCO course who later
received high ratings of field leadership was more enthusiastic
but poorer in reading, quantitative and verbal skills than his
colleague who received lower field leadership ratings. He was
more considerate, but less able to perform on in-basket exercises
and in simulated emergencies. He viewed himself as more athletic,
firm, careful, soothing and brave than did his low-rated
colleague.


PREDICTIVE VALIDITY OF DIFFERENT CLASSES OF ACTR SCORES

Self-Description Instruments provided the largest proportion
of criterion predictors and also provided these scores with the
least assessor and assessee time. On the other hand, the most
assessor-intensive formal ACTR exercises actually do the poorest
job of predicting the field leadership criterion. Intermediate
between these extremes is the Entry Interview which provided a
fair number of predictors with only a moderate amount of assessor
and assessee time.

These results must be somewhat distressing to proponents of
the assessment center concept. Such formal exercises as the In-
Basket, Assigned Leader Group Exercise and Leaderless Group
Discussion are the backbone of such centers. For such exercises
to predict poorly in the current setting, despite good to
excellent reliability of predictor and criterion measures,
indicates a mismatch between the ACTR exercise measures and the
criterion scores. A possible explanation of this mismatch is that
the ACTR was more effective in eliciting leadership skills than
the subsequent duties of these leaders. The USAIS ACTR exercises
probably did provide tough challenges to leadership and actual
assessee leadership skills were probably demonstrated for
assessors to rate. However, the criterion ratings were made
during peacetime when few if any emergencies would arise which
required excellent leadership for their successful resolution. As
a result, the criterion ratings may have been made on some other
factor than leadership, such as sociability. Another possible
basis for field ratings may have been the leadership self-
conceptions that the assessees held and somehow communicated to
the superiors, peers, and subordinates who provided the criterion
ratings. With few if any opportunities for assessees to
demonstrate genuine leadership, this "talk about leadership" may
have been the basis for leadership ratings. Not only would this
account for the general failure of assessor-intensive exercises to
predict the criterion, it would explain the relative success of
instruments such as the Person Description Blank which were
specifically designed to obtain leadership-related self-
conceptions.

Future validation studies planned for the USAIS ACTR
assessees will utilize promotion date as a leadership criterion.
Hopefully, promotions of these leaders would be related to their
actual leadership skills and not to sociability or to their
incorrect self-perceptions of their leadership skills.
ASSIGNMENT PROCEDURES IN THE AIR FORCE
Brooks AFB, Texas


I.  INTRODUCTION 

In July 1973, personnel from the Air Force Recruiting Service and
Air Force Human Resources Laboratory (AFHRL) discussed strategies for
examining the feasibility of a computer-based enlistment reservation
system to enhance the existing Air Force Procurement Management Information
System (PROMIS). A small computer-based job reservation system was
developed using the System 2000 data management system to demonstrate to
recruiting service personnel the feasibility of on-line job reservations
(Ward and Haltman, 1975). This demonstration, in September 1973, resulted
in the development by Air Force Military Personnel Center, Recruiting
Service and AFHRL of an operational job-reservation system (Pina and
Stifle, 1977). The system became operational 1 November 1976, with Air
Force representatives at the sixty-six Armed Forces Examining and
Entrance Stations (AFEES) inquiring through remote terminals to a
Burroughs 6700 computer located at Randolph AFB, Texas.

This  paper  discusses:  (1)  designing  personnel  systems  for  acceptance 
and  improvement,  (2)  a  general  framework  for  viewing  personnel  assignment 
systems, (3) the procedure for offering jobs in the PROMIS system

II. DESIGNING PERSONNEL SYSTEMS FOR
ACCEPTANCE, EVOLUTIONARY IMPROVEMENT,
AND TECHNOLOGY TRANSFER

A Personnel System may be viewed as a vehicle to aid in improving
the effectiveness of an organization. To be useful, a Personnel System
should be designed for:

DESIGNING PERSONNEL SYSTEMS
FOR ACCEPTANCE AND IMPROVEMENT

• ACCEPTANCE BY MANAGERS AND MEMBERS OF THE ORGANIZATION

• EVOLUTIONARY (INCREMENTAL) ADJUSTMENTS LEADING TO
  CONTINUED IMPROVEMENT

• EASE OF INCORPORATING NEW HUMAN RESOURCES RESEARCH
  FINDINGS INTO THE OPERATIONAL PERSONNEL SYSTEM

Acceptance 

If a personnel system is to have an opportunity to help an
organization, it must continue to exist. In order to exist, it must be
acceptable to managers and members of the organization. Designers of a
personnel system must plan for initial and continued acceptance by
members of the organization.

Evolutionary Improvement

Designers of a personnel system must allow for future changes, both
expected and unexpected. The system should expect those future policy
changes designed to improve personnel effectiveness. However, it is
impossible to foresee the problems that can arise after operational
implementation. The design features of the system that allow for change
also help insure continued acceptance. The capability to change must be
approached with caution, since too frequent or too much change might
lead to non-acceptance and destruction of the personnel system.

Incorporating New Research

In addition to allowance for expected management changes and
unexpected problems, it is highly desirable to design a personnel system
for acceptance of new human resources research findings. Some new
technologies may require major modifications to the system. However,
many future improvements can be incorporated easily into the operational
system if it contains a technology transfer capability.

III. A VIEW OF PERSON-JOB ASSIGNMENTS

This section presents a view of person-job assignments that allows
for user acceptance, evolutionary improvement, and transfer of new
research findings. The concepts to be described emphasize information
about jobs and people, pay-off or utility of particular person-job
assignments, and the contribution of each particular assignment to
overall system effectiveness. Before examining the details, it is
helpful to look at the Military Career Life Cycle.

[Figure: MILITARY CAREER LIFE CYCLE]


This picture represents some of the personnel decision activities
that take place during a military career. The objective is for persons
to move through various job or training activities so that overall
system effectiveness is maximized. The following ideas reflect some
essential features of a personnel system designed for acceptance and
improvement.

Activities to be Accomplished (Job and Training Requirements)

A necessary first step is the determination of the kinds of
activities (jobs or training) that must be performed in the Air Force.
This will be done from information about training requirements, job
requirements, occupational surveys, and other sources. The attributes
associated with jobs (or training positions) will be called job properties.
Figure 1, the JOB PROPERTIES ARRAY, represents the relevant job-attribute
information that is used in the personnel assignment system. The word
JOB refers to any descriptive state of being that is occupied by or is
potentially occupied by a person. The general term "jobs" can include
all Air Force jobs, plus activities that might be termed "training
jobs." Another important "job" concept is the last one shown in Figure
1, called an External Job. This category provides for a job outside the
particular sub-system of interest. The inclusion of an External Job
allows for rejecting personnel by assignment to a "job" outside the
system. In the Advanced Personnel Data System, Procurement Management
Information System (APDS-PROMIS) each applicant occupies an External Job
prior to assignment to an Air Force job.


Figure 1

JOB PROPERTIES ARRAY

[Array of jobs by relevant job-attribute information. Attributes
listed: tasks to be performed, relative difficulty, aptitude required,
experience required, training required, geographical location,
physical characteristics required. The last job shown is the
External Job.]
Personnel  Required  to  Accomplish  the  Activities 

After the jobs have been determined it is necessary to identify the
personnel that are available or potentially available to accomplish the
activities required to operate the Air Force. The attributes associated
with persons will be called person characteristics. Figure 2, PERSON
CHARACTERISTICS ARRAY, represents the relevant person-attribute
information that is used in the personnel assignment system. The word
PERSON refers to any individual that is a member of the Air Force or is
a potential member of the Air Force.


The last person indicated in Figure 2 is called a Shadow Person.
This Shadow Person provides for an imaginary person to be considered for
assignment. The inclusion of this Shadow Person allows for consideration
of Air Force jobs that are unfilled. The consequences of unfilled jobs
(represented by assigning Shadow Persons) are important in the APDS-PROMIS
System.

Figure 2

PERSON CHARACTERISTICS ARRAY


Pay-offs  Associated  with  Personnel  Assignments 

Next, it is necessary to determine some indication of effectiveness
or pay-offs to the Air Force of assigning a particular person to a
particular job. It is desired to find a way to combine different
information related to pay-off or value into a single composite indicator.
Information from management policy, from operations analysis studies,
and human resources research must be combined to yield an indicator of
pay-off. The attempt to obtain such pay-off measures will be done
through Policy Development procedures (Ward, 1977). Policy Development
includes the combination of Policy Capturing and Policy Specifying. For
Policy Capturing, a group of policy makers are presented performance-
related information (technical school grades, job performance reports,
or predictions of these variables, etc.) about a sample of persons and
jobs. The judges (policy makers) will be asked to state the "pay-off"
to the Air Force of this sample of persons associated with these
particular jobs. Then, a computer will attempt to capture the policy of the
judges by developing a mathematical model for predicting the judged
values from the person and job information.
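
One way to picture Policy Capturing is as a least-squares fit of the
judged pay-offs on the person and job information. The sketch below
is an assumption for illustration only: the text does not say which
mathematical model is used, and the function names, the two
predictors (aptitude and difficulty), and the judged values are all
hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def capture_policy(features, judged):
    """Least-squares weights (intercept first) that predict judged pay-offs."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, judged)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Predicted pay-off for one person-job combination."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Hypothetical sample: (aptitude, difficulty) pairs and the judges' pay-offs.
sample = [(40, 40), (60, 40), (40, 60), (60, 60), (80, 50)]
judged = [3.0, 5.0, 2.0, 4.0, 6.5]
weights = capture_policy(sample, judged)
print([round(w, 3) for w in weights])
print(round(predict(weights, (70, 50)), 3))
```

Once the weights are fitted, the same model can score person-job
combinations that the judges never rated, which is the purpose the
paragraph describes.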

In Policy Specifying, managers express their "pay-off" to the Air
Force of various person-job combinations through statements about
general constraints that the mathematical model should have. When these
constraints are imposed, a model evolves which will produce pay-off
values consistent with the specified policy guidelines.

When appropriate, Policy Specifying and Policy Capturing can be
combined to yield a mathematical model for estimating the value to the
Air Force of any person for any Air Force job.


Figure 3, PREDICTED PAY-OFF ARRAY, represents the pay-off values
estimated from the mathematical model using the person-job information.
The pay-offs associated with the Shadow Person (last row) reflect the
values to the Air Force (possibly negative values) of not filling
various jobs. The pay-offs associated with the External Job (last
column) reflect the values to the Air Force (possibly negative values)
of not assigning each person to an Air Force job. In APDS-PROMIS, each
applicant is already in an External Job and some applicants are not
accepted into Air Force assignments.
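
The layout just described can be sketched as a small data structure.
The function name and all numbers below are hypothetical, and setting
the corner cell (Shadow Person in the External Job) to zero is an
assumption, since the text does not define that value.

```python
def augment_payoff(payoff, unfilled_job_values, unassigned_person_values):
    """Append an External Job column and a Shadow Person row to a pay-off array."""
    rows = [list(row) + [external]
            for row, external in zip(payoff, unassigned_person_values)]
    # Last row: value (possibly negative) of leaving each job unfilled.
    # The corner cell is assumed to be 0.
    rows.append(list(unfilled_job_values) + [0])
    return rows


# Two persons by two jobs, with hypothetical pay-offs.
array = augment_payoff([[6, 2], [3, 5]],
                       unfilled_job_values=[-4, -1],
                       unassigned_person_values=[0, 0])
for row in array:
    print(row)
```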
[Figure 3. PREDICTED PAY-OFF ARRAY: persons by jobs]

Allocation of Personnel for Overall Air Force Effectiveness


After the elements of the PREDICTED PAY-OFF ARRAY are available, it
is necessary to allocate persons to jobs in a way that will tend to
maximize overall Air Force effectiveness. The allocation process may
not always assign a person to the job for which he has the highest
pay-off to the Air Force since many persons must be considered for the job.
The attempt is to make assignments that will tend to maximize overall
Air Force effectiveness. Figure 4, ALLOCATION ARRAY, contains allocation
indicators and represents the information that reflects the desirability
for overall Air Force effectiveness of assigning particular persons to
particular jobs. This information can reflect the results of an optimal
allocation algorithm when appropriate (e.g., Langley's Primal Algorithm
(Langley, Kennington, Shetty, 1974)). In this case, the elements of the
ALLOCATION ARRAY will contain values of 1 where the assignments result
in the maximum overall pay-off and 0 for the non-optimum assignments.
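
A 0/1 allocation array of this kind can be illustrated on a very
small problem. The sketch below finds the optimum by exhaustive
search over all one-to-one assignments; it is not the primal
algorithm cited above, it is practical only for a handful of persons
and jobs, and the pay-off values are hypothetical.

```python
from itertools import permutations


def optimal_allocation(payoff):
    """Return (0/1 allocation array, total pay-off) for a square pay-off array."""
    n = len(payoff)

    def total(jobs):
        return sum(payoff[person][job] for person, job in enumerate(jobs))

    best = max(permutations(range(n)), key=total)
    allocation = [[1 if best[person] == job else 0 for job in range(n)]
                  for person in range(n)]
    return allocation, total(best)


# Hypothetical pay-offs: rows are persons, columns are jobs.
payoff = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
allocation, best_total = optimal_allocation(payoff)
for row in allocation:
    print(row)
print(best_total)
```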

The ALLOCATION ARRAY may also reflect a wide range of numerical
values (e.g., Ward's Decision Index (Ward, 1959)) that when used as a
basis of assignment will tend toward maximum overall Air Force
effectiveness. This approach is appropriate when a sequential-constrained-choice
assignment is desired (such as in APDS-PROMIS), the problem is too large
for optimum solution, or some of the data required for optimum solution
is not available (Ward and Davis, 1963). Both optimum allocation
algorithms (for batch assignments) and near-optimum procedures (for
sequential-constrained-choice) should be available in a personnel system
and used as appropriate.


[Figure 4. ALLOCATION ARRAY: persons by jobs]

EXAMPLE OF PREDICTED PAY-OFF ARRAY AND ALLOCATION ARRAY. Figure 5
illustrates the difference between the PREDICTED PAY-OFF ARRAY and the
ALLOCATION ARRAY. The elements of the allocation array reflect that
assignment of person 1 to job 3 (allocation index = 14.0), person 2 to
job 1 (allocation index = 14.0), and person 3 to job 2 (allocation index
= 13.5) will maximize the sum of pay-off values (6 + 5 + 4 = 15). It is
interesting to observe that an optimum allocation algorithm would produce
an allocation array with values of 1 in the place of the index values
14.0 (Person 1, Job 3), 14.0 (Person 2, Job 1), 13.5 (Person 3, Job 2)
to reflect the optimum assignments and 0 in the other 6 locations.
However, the values that are now in the array provide for alternative
assignments that maintain near optimality. This is operationally
important in a system that provides for choice in either a sequential or
batch assignment system. A person can be allowed to choose from jobs
which have high allocation index values and thereby maintain high overall
Air Force effectiveness. For example, if person number 1 were allowed
to choose either job 2 or 3, and he chose job 2 (second highest
allocation index), then a pay-off sum of 13 would be possible. (Either
7 + 5 + 1 = 13 or 7 + 0 + 6 = 13.)
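
The sums quoted in this example can be checked with a short sketch.
The pay-off array below is only partly taken from the text: the
entries 7, 6, 5, 4, 1, 0 and 6 follow the sums quoted above, on the
assumption that each sum lists persons 1, 2 and 3 in order; the
placement of the 0 and the second 6, and the two remaining entries
(both set to 2), are hypothetical.

```python
from itertools import permutations

# Rows are persons 1-3, columns are jobs 1-3 (partly hypothetical values).
PAYOFF = [
    [2, 7, 6],
    [5, 2, 0],
    [6, 4, 1],
]


def best_total(payoff, fixed=None):
    """Largest pay-off sum over one-to-one assignments honoring fixed choices.

    `fixed` maps a zero-based person index to the zero-based job chosen.
    """
    fixed = fixed or {}
    n = len(payoff)
    totals = [
        sum(payoff[person][job] for person, job in enumerate(jobs))
        for jobs in permutations(range(n))
        if all(jobs[person] == job for person, job in fixed.items())
    ]
    return max(totals)


print(best_total(PAYOFF))          # no choice exercised
print(best_total(PAYOFF, {0: 1}))  # person 1 chooses job 2
```

With these values the unconstrained optimum and the best total after
person 1 takes job 2 agree with the 15 and 13 stated in the text.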
[Figure 5. EXAMPLE OF PREDICTED PAY-OFF ARRAY AND ALLOCATION ARRAY.
Two panels, PREDICTED PAY-OFF ARRAY and ALLOCATION ARRAY, each with
persons as rows and Job 1, Job 2, Job 3 as columns. Figure note: the
higher numbers in the Allocation Array reflect the desirability of
assignments for overall effectiveness of the Air Force. The figure
also shows the overall effectiveness obtained when the highest
allocation indexes are used.]

Summary of the Personnel Assignment System

Figure 6 summarizes the basic features of the personnel assignment
system. Information about jobs (Figure 1) and people (Figure 2) is
mixed to generate a pay-off (or value) of each potential person-job
assignment (Figure 3). From the pay-off array an allocation array
(Figure 4) is produced to indicate the appropriateness of each potential
assignment for overall Air Force effectiveness.

[Figure 6. SUMMARY OF THE PERSONNEL ASSIGNMENT SYSTEM. Legible
labels: Decision Index; Transportation Algorithm; ALLOCATION ARRAY
(Figure 4).]

While Figure 6 summarizes the personnel assignment system which
considers personnel and jobs as they exist, Figure 7 represents the
modification of job properties and modification of person
characteristics so that the pay-off array can be improved. Continued personnel
training, occupational re-design and organizational improvement can
bring about desired changes in personnel and jobs.

[Figure 7. A VIEW OF PERSONNEL ASSIGNMENTS INCLUDING PERSON AND JOB
MODIFICATION]


The [...] system described above [...] result in [...]. The [...]
section describes the application of [...] APDS-PROMIS.

IV. ADVANCED PERSONNEL DATA SYSTEM,
PROCUREMENT MANAGEMENT INFORMATION
SYSTEM (APDS-PROMIS)

Recruiting Service lists the characteristics of APDS-PROMIS:

WHAT IS APDS-PROMIS?

• Real-time computer system to replace telephone link
• Job counselling transferred to AFEES processing team
• Computerized preenlistment job destination (p/j match)
• Recruiting objectives for ?B days
• Improved requirement accounting
• Reduced manual reporting
• More professional recruiting image


The following special features were considered in the design of the
system:


SPECIAL FEATURES OF
PERSON-JOB MATCH FOR PROMIS ENHANCEMENT

• Sequential consideration of persons to be assigned
• Future accessions are unknown
• List of opportunities must be provided
• Opportunities must be immediately available

SPECIAL FEATURES
FOR ACCEPTANCE AND MAINTENANCE

• Payoff functions easy to define and modify
• Effects of modifications are easily visible on opportunities list
• Provide capability through which human resources research
  findings can affect and improve individual personnel assignments

Opportunity 

The major component of PROMIS is the OPPORTUNITY command. The
following events provide the ordered list of jobs from which an applicant
may choose:

OPPORTUNITY
Person/Job Match

• Input applicant aptitude, physical & preference data
• Test qualification for jobs
• Test availability of jobs
• Compute "worth" (appropriateness) value for each job
• Maximize total worth to Air Force and individual
• Provide list of most appropriate jobs
• CTEP
• Open enlistment
• Offer option to reserve job from list
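
The sequence of events listed above can be sketched as a filter
followed by a ranking. Everything in the snippet is hypothetical: the
field names, the job records, and the stand-in worth function are
illustrations and do not describe the actual PROMIS data or pay-off
model.

```python
def opportunity(applicant, jobs, worth, top_n=3):
    """Return names of qualified, available jobs in descending order of worth."""
    candidates = [
        job for job in jobs
        if applicant["aptitude"] >= job["min_aptitude"]  # test qualification
        and job["open_slots"] > 0                        # test availability
    ]
    candidates.sort(key=lambda job: worth(applicant, job), reverse=True)
    return [job["name"] for job in candidates[:top_n]]


def closeness(applicant, job):
    """Stand-in worth: higher when aptitude is closer to job difficulty."""
    return -abs(applicant["aptitude"] - job["difficulty"])


jobs = [
    {"name": "A", "min_aptitude": 40, "difficulty": 45, "open_slots": 2},
    {"name": "B", "min_aptitude": 60, "difficulty": 65, "open_slots": 1},
    {"name": "C", "min_aptitude": 50, "difficulty": 55, "open_slots": 0},
    {"name": "D", "min_aptitude": 80, "difficulty": 85, "open_slots": 3},
]
print(opportunity({"aptitude": 70}, jobs, closeness))
```

The applicant would then be offered the option to reserve one of the
jobs on the returned list.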
Predicted Pay-off Values. As indicated above, an essential step is the
creation of a pay-off array. There are five components contributing to
the pay-off values.

CREATING PREDICTED PAY-OFF
OF A PERSON-JOB COMBINATION
USING
POLICY SPECIFYING

• Person-Aptitude and Job Difficulty (The A-D Component)
• Technical training success
• Aptitude area preferences
• Rate of job fill
• Minority job fill


Aptitude Potential and Job Difficulty. Research findings and experienced
personnel people have indicated that interacting a person's aptitude
with the job's aptitude requirements so that the most talented people
are assigned to the most demanding jobs will reduce training costs,
increase job satisfaction and productivity, and improve personnel
retainability. This concept has been implemented through the A-D
(Aptitude-Difficulty) component.

APTITUDE POTENTIAL AND JOB DIFFICULTY

Y = f(A, D)
where
A = Aptitude for particular job
D = Relative difficulty of particular job

A three-dimensional view of this component is shown in the following
figure.

[Figure: Pay-off Function of Aptitude and Difficulty. Vertical axis:
pay-off value.]


This figure indicates that for a low difficulty job (for example,
D = 40) there is a slight increase in pay-off as aptitude increases;
however, for a higher difficulty job (for example, D = 60) the increase
in pay-off is more rapid. Also, notice that for a low aptitude person
(for example, Aptitude = 40) the highest pay-off is on a low difficulty
job, with the pay-off decreasing rapidly as difficulty increases. And
for higher aptitude persons the best pay-off is on higher difficulty
jobs. A person will have maximum pay-off when his aptitude closely
matches the job requirements. And higher aptitudes matched to more
difficult jobs are more valuable than lower aptitudes matched to less
demanding jobs.

At the present time, only that part of the function to the left (or
higher side) of the ridge is getting any use because existing ineligibility
rules do not allow applicants who have aptitudes below a certain
cut-off to be considered for a job; i.e., the worth below the cut-off is
negative infinity! However, if policy makers allow applicants to become
eligible for jobs slightly below existing cut-off scores, the pay-off
function is available for use. A slight lowering of cut-off rules would
allow greater flexibility in making personnel assignments, which should
result in better use of manpower.
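
The shape described above can be illustrated with a short sketch. The functional form, the penalty weight, and the name ad_payoff are assumptions made only for illustration; the actual PROMIS pay-off surface comes from policy specifying and is not given here. The sketch reproduces only the qualitative properties stated in the text: pay-off rises faster with aptitude on harder jobs, falls off rapidly when difficulty exceeds aptitude, and is negative infinity below an eligibility cut-off.

```python
import math


def ad_payoff(aptitude, difficulty, cutoff=None, mismatch_penalty=3.0):
    """Hypothetical A-D pay-off for one person-job pair (illustrative only)."""
    # Ineligible applicants: the worth below the cut-off is negative infinity.
    if cutoff is not None and aptitude < cutoff:
        return -math.inf
    # Product term: slope in aptitude is steeper for more difficult jobs.
    value = aptitude * difficulty / 100.0
    # Steep penalty when the job demands more than the person offers.
    shortfall = max(0.0, difficulty - aptitude)
    return value - mismatch_penalty * shortfall


# A low aptitude person (40) does best on the low difficulty job.
low_person = {d: ad_payoff(40, d) for d in (40, 50, 60)}
# A higher aptitude person (60) does best on the matching, harder job.
high_person = {d: ad_payoff(60, d) for d in (40, 50, 60)}
```

Passing a cutoff equal to the current eligibility score reproduces present practice; lowering it slightly makes part of the surface on the other side of the ridge available for ordering.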

Technical Training Success. The second component is technical training
success. This function involves predicted technical school success from
aptitude tests, high school courses taken, the particular technical
school, and high school graduation status.

TECHNICAL TRAINING SUCCESS
Y = f(AQE, AFQT, HS courses, Tech Schools)

Aptitude Area Preference. Each applicant expresses a relative preference
weighting for the four areas: Mechanical, Administrative, General, and
Electronics. These preferences are considered in the pay-off function.

APTITUDE AREA PREFERENCES

Y = f(M, A, G, E Preferences)

• Mechanical AI
• Administrative AI
• General AI
• Electronics AI

This component may be replaced in the future by the Vocational Interest
Career Examination (VOICE).

Job Fill Rate. This dynamic feedback component is of extreme importance
to recruiting service. It reflects interaction between the percentage
of jobs sold, amount of time since the job was released, and a priority
associated with each job. As each job is reserved, and as time changes,
this component is modified to change emphasis on jobs that are ahead of or
behind a desired rate of fill.


JOB FILL RATE

Y = f(P, T, X)

P = Percentage of jobs sold
T = Amount of time since job release

This job fill rate component is in the process of being modified to
reflect the actual number of unfilled jobs interacting with the other
three job properties: percentage fill, time, and priority.
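
A minimal sketch of such a feedback component follows. The linear desired fill schedule, the days_to_fill parameter, and the function name are assumptions for illustration and are not taken from PROMIS; the sketch only shows how percentage sold, time since release, and priority could interact so that jobs behind the desired rate receive more emphasis.

```python
def fill_rate_component(pct_sold, days_since_release, priority, days_to_fill=30.0):
    """Hypothetical fill-rate pay-off: positive when a job is behind schedule."""
    # Assumed schedule: the job should sell at a steady rate over days_to_fill.
    desired = min(1.0, days_since_release / days_to_fill)
    # Lag is positive when fewer jobs are sold than desired at this time.
    lag = desired - pct_sold
    # Priority scales how strongly the lag changes the emphasis.
    return priority * lag


behind = fill_rate_component(pct_sold=0.2, days_since_release=15, priority=1.0)
ahead = fill_rate_component(pct_sold=0.8, days_since_release=15, priority=1.0)
```

Recomputing the value each time a job is reserved, or as the days pass, gives the dynamic behavior described above.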

Minority Job Fill

This dynamic component is continuously adjusted to maintain a
specified minority balance across jobs.

MINORITY JOB FILL

Y = f(Pm, G)
where
Pm = Percentage of jobs filled by minorities
G = Desired minority job fill goal


Maximizing Overall Air Force Effectiveness. PROMIS requires presentation
of an ordered list of jobs from which applicants may choose. An
Allocation Index is computed that reflects the desirability for overall
Air Force effectiveness of assigning the applicant to each job on the
list. An Allocation Index called the Optimality Indicator is used as
the basis of ordering. This index is based on the Decision Index (Ward,
1959) described above.


ASSIGNMENT OF PERSONNEL
TO MAXIMIZE OVERALL AIR FORCE EFFECTIVENESS

Decision Index used as the allocation index
for ordering the opportunities list
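
The ordering step itself can be sketched briefly. The sketch assumes the allocation index has already been computed for each job; the job labels and index values are hypothetical, and the computation of the Decision Index is not reproduced here.

```python
import math


def order_opportunities(index_by_job):
    """Return eligible jobs ordered from highest to lowest allocation index."""
    # Jobs with an index of negative infinity are ineligible and are dropped.
    eligible = {job: v for job, v in index_by_job.items() if v != -math.inf}
    return sorted(eligible, key=lambda job: eligible[job], reverse=True)


# Hypothetical index values for one applicant.
opportunities = order_opportunities({"Job A": 1.0, "Job B": 3.0, "Job C": -math.inf})
```

The applicant then chooses from the ordered list, so jobs that contribute most to overall effectiveness are presented first without being forced on anyone.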


V.  PLANNED  IMPROVEMENTS  TO  PROMIS 

The evolutionary capability of the system allows for incorporating
modifications as required. Planned improvements are shown below.


PLANNED IMPROVEMENTS

• Modify fill-rate component to reflect actual number of
jobs unsold

• Combine the aptitude-difficulty component interactively
with the fill-rate component to reflect policy in which
the importance of fill-rate is different for different
levels of the aptitude-difficulty component

• Combine attrition prediction information with training
costs into the pay-off function to channel good risks to
more expensive training and poor risks to less expensive
training

• Introduce results from the Vocational Interest Career
Examination (VOICE) into the pay-off function to
improve job satisfaction and personnel retainability

• Consider interaction of the aptitude-difficulty component
with the VOICE (interest) component


VI. APPLICABILITY TO OTHER PERSONNEL SYSTEMS

The concepts above can be applied to any personnel system that
would like to match person characteristics with job properties and
produce either an ordered list of job opportunities from which an
applicant may choose (as in APDS-PROMIS) or an ordered list of applicants
from which a job manager may choose (as when a job must be filled).
The airmen post-enlistment assignment system, now being developed,
should be applicable to a wide variety of personnel sub-systems: airmen,
officers, and civilians.


APPLICABILITY TO OTHER PERSONNEL SYSTEMS

• Air Force enlisted re-assignments
• Officer assignments
• AF civilians
• Others

VII. SUMMARY OBSERVATIONS


A mechanism is evolving through which human resources research
findings can directly affect and improve individual personnel assignments.
System flexibility provides for modification and introduction of
new components to ensure continued acceptance and improvement. The
approach has general applicability to personnel systems that can identify
information about persons and jobs and specify a pay-off generating
policy.

Implementation of this approach has led to identification of areas
of human resources research that will contribute significantly to
improved systems performance.


RESEARCH AREAS OF POTENTIAL VALUE


• SEARCH FOR PERSON CHARACTERISTICS AND JOB PROPERTIES
THAT INTERACT IN PREDICTION OF PAY-OFF VALUES

• DEVELOP NEW METHODS FOR SPECIFYING THE PAY-OFF
VALUES ASSOCIATED WITH PERSON-JOB ASSIGNMENTS

• STUDY THE USE OF ALLOCATION INDEXES NOT ONLY AS AN
ORDERING VALUE FOR OPPORTUNITY LISTS, BUT AS A
SUPPLEMENT TO APTITUDE INDEXES NOW IN USE


On October 18, 1976, the 80th Maneuver Training Command received
the requirement to produce the documents required to evaluate the combat
readiness of the 11th Special Forces Group-Airborne, a group targeted
against Eastern Europe. The actual adaptation of Army Training and Evaluation
Program 31-101 (ARTEP 31-101), the primary evaluation tool, was the
responsibility of the Infantry Team in general and the author in particular.

The author realized that his biases were based on his combat experience
in Southeast Asia and that these biases had been reinforced by the three
years he spent in a Special Forces unit targeted against Asia. Based on
this, the author recommended that the production of the ARTEP be assigned
to an individual with extensive European Special Forces experience or that
an attempt be made at quantifying the perceptions of those who had European
Special Forces experience. The decision was made to quantify the perceptions
of those involved in the evaluation. This decision was based on
the assumption that underlies decentralized training, i.e., that the
Commanding Officers will train their units in accordance with their perceptions
of the combat requirements of the area they are targeted against
(in the case of Special Forces) and will use the ARTEP as a guide.

The ARTEP was dismembered and each combat requirement that it contained
became part of a pool of combat requirements that were used for
a Q-sort. The Q-sort was administered to Special Forces qualified persons
who had extensive European experience. Those specific combat requirements
of the ARTEP that were selected as being important to a
Special Forces Group targeted against Europe became the stimuli for a
pair comparisons test.

The pair comparison was administered to the Deputy Commanding
Officer (DCO), Operations (S-3), Intelligence (S-2), and Supply (S-4)
Officers, plus the 2nd Battalion, 3rd Battalion and Support Battalion Commanding
Officers. Three Operational Detachment (ODA) Commanding Officers
of the 11th Special Forces Group-Airborne (SFGA) were also included.
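
The method used to convert the pair-comparison judgments into scale values is not stated here. One common approach is Thurstone's Case V scaling, sketched below purely as an assumption; the function name, the clamping limits, and the example counts are hypothetical.

```python
from statistics import NormalDist


def case_v_scale(wins):
    """Thurstone Case V scale values.

    wins[i][j] is the number of judges who rated stimulus i as more
    important than stimulus j. Returns one scale value per stimulus.
    """
    n = len(wins)
    inv = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # a stimulus compared with itself contributes z = 0
            total = wins[i][j] + wins[j][i]
            p = wins[i][j] / total
            # Clamp unanimous proportions so the normal deviate stays finite.
            p = min(max(p, 0.02), 0.98)
            z_sum += inv(p)
        scale.append(z_sum / n)
    return scale


# Hypothetical counts for three stimuli judged by ten raters.
example = case_v_scale([[0, 8, 9], [2, 0, 7], [1, 3, 0]])
```

Scale values produced this way are centered on zero, so positive values mark requirements judged more important than average and negative values mark those judged less important.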

The question then became, "Are the perceptions as measured by the
Pair Comparison a true reflection of the relative importance of the combat
requirements of Eastern Europe?" The answer to this question was
needed to ensure that an evaluation stressing the combat requirements of
greatest relative import was a true reflection of the actual combat
requirements of Eastern Europe. The answer was gained
by administering the Pair Comparison to a similar population in the
10th Special Forces, a highly regarded active Army unit targeted against
the same area. The administration of the Pair Comparison to the 10th
Special Forces served two purposes. First, it gave an indicator of the
accuracy of the 11th SFGA's perceptions of the relative import of the
combat requirements of Eastern Europe. Second, since the 10th Special
Forces was furnishing the actual on-the-ground evaluators, an indicator
could be gained as to whether differences in evaluator-evaluatee perceptions
would be likely to skew the evaluation.

The results of both administrations of the Pair Comparisons indicated
that the focus of the evaluation and training should be on the
training and use of the guerrillas. Because of the close coordination
required to develop and sustain a guerrilla force, the decision was made
to develop three situations. They consisted of:

1. An operation in which the selected operational capabilities of
the Operational Detachment are measured (ARTEP 31-101).

2. A situation that stressed the importance of the guerrilla
through intelligence play.

3. A brief situation that stressed S-2 (Intelligence) and S-3
(Operations) coordination.

The  use  of  guerrillas  can  be  divided  into  two  areas: 

1.  Operations  support,  which  ARTEP  31-101  encompasses. 

2.  Intelligence  support  at  both  a  tactical  and  strategic  level. 

The lack of intelligence requirements in ARTEP 31-101 required the generation
of a situation in which intelligence reports would be transmitted
both up and down the chain of command. This was accomplished by interrelating
intelligence reports from different levels. An example follows.

SOCGULF (Evaluator control HQ)

SFOB, Ft. Rucker, AL: Special Forces Operations Base (11th SFGA HQ)

ACB, Ft. Stewart, GA: 2nd Bn, 11th SFGA
ACB, Ft. Benning, GA: 3rd Bn, 11th SFGA

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

ODA  ODA  ODA  ODA                    ODA  ODA  ODA  ODA

GGGGGGGG                              GGGGGGGG

In the play of the problem, the fact that the 12th Combined Arms Army
(Soviet) was moving two tank divisions forward to reinforce was the result
of many indicators from each level. That tracked vehicles were moving forward
was based on Side Looking Airborne Radar (SLAR) reports that were
received at the Special Forces Operations Base (SFOB). That the tracked vehicles
were predominately tanks was based on guerrilla agent reports given to the
Operational Detachment (ODA). That the tanks were from at least two different
units was based on the correct interpretation of photos taken by
the guerrillas, passed to the Operational Detachment, exfiltrated from
the operational area, and returned to the Advanced Control Base (ACB) level
and eventually to the Special Forces Operations Base level.

For the full story to emerge, the requirement was for each level to
synthesize the information it received and pass it on to both superior
and subordinate headquarters. The interrelating of intelligence reports
across levels prevented any single headquarters from independently piecing
together the whole story.

This  technique  was  employed  to  portray: 

1.  The  forward  deployment  of  two  tank  divisions. 

2. Combat tailoring of enemy units to counter the increased insurgent
threat being developed by the Operational Detachments as they trained
the guerrillas (guerrillas were actually high school ROTC cadets).

3. New troop deployments indicating a build up to halt the impending
friendly conventional offensive.

Each Operational Detachment received the same intelligence reports
from the guerrillas. This allowed the Battalion S-2 (Intelligence Officer)
to assess the relative efficiency with which each Operational Detachment
processed and forwarded the information received from the guerrillas. Where the
photos of enemy equipment given to the Operational Detachment by the
guerrillas could not be interpreted and forwarded as messages, the imagery
itself was to be exfiltrated via the Fulton Recovery system.

Similar procedures were required of the Special Forces Operations
Base when receiving information from SOCGULF (Evaluator Control Headquarters),
Advanced Control Bases, or Operational Detachments. The requirement
was to synthesize the information with information already on
file, draw conclusions, and forward the intelligence derived either by radio
message, if it was of an immediate/critical nature, or by a daily publication
called an INTSUM (Intelligence Summary). Photos that could not be interpreted
at a lower level were the responsibility of the Imagery Interpretation
section of the Group Military Intelligence Detachment.

A third situation was based on the 11th SFGA resupply request message
format. An aerial resupply operation calls for the approving authority of
that mission to integrate all information available to ensure not only the
success of the mission but the survival of the aircraft. This is accomplished
by notifying the Air Force element that is to fly the mission of
any natural or manmade hazards that are associated with a particular drop
zone. The notification can take several forms, the most common of which
is requiring the aircraft to follow a specific track or azimuth when
crossing over the Drop Zone.

The  performance  of  the  11th  SFGA  could  then  be  judged  on  three 
criteria: 

1. The extent to which they adequately performed tasks required in
ARTEP 31-101.

2. The extent to which the Operational Detachments, Advanced Control
Bases, and Special Forces Operations Base were able to alert SOCGULF
to the:

a. forward deployment of two tank divisions

b. combat tailoring to meet the guerrilla threat

c. new troop deployments.

3. The ability of the S-2 and S-3 sections to coordinate their
activities.

The formal presentation of the evaluation based on ARTEP 31-101 will
not be made until October, 1977. It does appear that, with the exception
of communication, the Operational Detachments did reasonably well on
the operational portions of the operation. The result of the intelligence
portion (#2) was less encouraging. The Operational Detachments failed
to communicate the information derived from the guerrillas to the Advanced
Control Bases and Special Forces Operations Base. This failure cannot be
blamed on communication difficulties alone. The Fulton recovery extracted
no intelligence summaries, no raw intelligence reports, no agent reports
and no imagery. Similarly, there was no indication that Special Forces
Operations Base or Advanced Control Base INTSUMs were received by the
Operational Detachments. One particularly disheartening aspect of the
intelligence portion of this operation was that the Imagery Interpretation
section of the 11th Special Forces' Military Intelligence Detachment was
unable, because of lack of photo keys, to interpret over 85% of the imagery
that was administratively provided to them. This was the same imagery
given to each Operational Detachment by the guerrillas, all of which was
organic to the Soviet Motorized Rifle Division or Combined Arms Army.

More disheartening was the fact that the imagery interpreters were unable
to interpret major end items, i.e., tanks, self-propelled artillery, or
to tell the author where, by number and location, in the Soviet system
these weapons would be found.

The third situation generated tested the ability of the S-2 and S-3
sections of the 11th Special Forces to coordinate or tie together their
activities for the conduct of an aerial resupply mission. To this end
intelligence reports positioning Surface-To-Air Missile (SAM) complexes
just out of range of several resupply Drop Zones (DZ) were forwarded from
SOCGULF to Special Forces Operations Base. The requirement was for each
message/Drop Zone request sent to the US Air Force to have a required
magnetic azimuth that would route the resupply aircraft out of SAM range
incorporated in the request. This would have required that the S-2 receive
the message, mark on a map the positions of the SAMs, and determine range
fans for each SAM complex. Coordination would then have to be made with
the S-3 to ensure that the Drop Zone request included the required azimuth
to protect the USAF resupply aircraft. None of the Drop Zone requests
forwarded to the USAF included a required azimuth. The inability of the
11th SFGA to tie the operations and intelligence sides of this operation
together in this instance indicates serious training deficiencies that go
beyond the scope of this evaluation.
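
The S-2/S-3 coordination task described above amounts to a simple geometric check, sketched here under stated assumptions: flat map coordinates in kilometers, circular SAM range fans, a straight track through the Drop Zone, and no conversion between grid and magnetic azimuth. The function names, leg length, and example positions are hypothetical.

```python
import math


def track_is_clear(dz, azimuth_deg, sam_sites, leg_km=20.0, step_km=0.5):
    """True if a straight track through the DZ stays outside every SAM range."""
    theta = math.radians(azimuth_deg)
    # Azimuth is measured clockwise from north: east = sin, north = cos.
    dx, dy = math.sin(theta), math.cos(theta)
    steps = int(leg_km / step_km)
    # Sample points along the track on both sides of the DZ.
    for k in range(-steps, steps + 1):
        px = dz[0] + dx * k * step_km
        py = dz[1] + dy * k * step_km
        for sx, sy, rng in sam_sites:
            if math.hypot(px - sx, py - sy) <= rng:
                return False
    return True


def clear_azimuths(dz, sam_sites, step_deg=5):
    """Azimuths (0-179 degrees) whose track through the DZ avoids all SAM fans."""
    return [az for az in range(0, 180, step_deg)
            if track_is_clear(dz, az, sam_sites)]


# Hypothetical DZ at the origin with one SAM complex 10 km east, range 8 km,
# so the DZ itself is just out of range.
sams = [(10.0, 0.0, 8.0)]
usable = clear_azimuths((0.0, 0.0), sams)
```

In this example a north-south track is usable while an east-west track is not, which is the kind of required azimuth the Drop Zone request was expected to carry.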

Of the three situations generated for the evaluation of the 11th,
only one was based on ARTEP 31-101. A conclusion noting that the 11th
SFGA was testbound would, however, be premature. More likely, it is the
carry over effect of past Army Training Tests which were almost totally
operations oriented.

The use of the perceptions of the 11th SFGA to focus the evaluation
was, as stated earlier, based on the premise that the commanders will
train for what they consider important. Unfortunately, the results of
the Pair Comparisons test, when contrasted to the training conducted by
the 11th SFGA, highlighted additional problems. The Pair Comparison results
of both the 10th and 11th Special Forces place Airborne Operations
at the very bottom of the priority list and the ability to effectively
train a guerrilla force at the very top of the priority list. Yet, during
Annual Training the 11th Special Forces spent approximately $300,000 on
air operations and none on language training, a capability which,
historically, has dictated the success or failure of an unconventional
warfare effort. The almost perfect inverse relationship between perceived
importance of combat requirements and resource allocation for combat
requirements casts doubt on the authenticity of the mission statement for
Reserve Special Forces. The discrepancy between mission and resources
allocated by regulation leaves the command of the 11th Special Forces in a
position of being unable to address those aspects of Special Forces operations
that are considered essential (by both the 11th Special Forces
and 10th Special Forces) to the success of combat operations.

COMBAT REQUIREMENTS AS PERCEIVED BY THE 10th
AND 11th SPECIAL FORCES GROUP-AIRBORNE

10th SFGA   11th SFGA
Value       Value       Stimuli

 -.17        -.56        1. Infiltrate by parachute (static line)
-1.01       -1.62        2. Infiltrate by parachute (HALO)
 +.32        -.05        3. Assemble detachment personnel and account for equipment
 +.49         .05        4. Contact resistance force reception committee
 +.61        -.0?        5. Sterilize infiltration site
 +.30        +.08        6. Begin movement to designated base
  .53        +.48        7. Begin area assessment and contact SFOB
 +.28        +.65        8. Organize guerrilla force with command and staff
 +.43        +.?9        9. Develop a training program for the guerrilla force
 -.77        -.05       10. Select and report LZs/DZs
 -.99        -.11       11. Secure, mark and operate LZs/DZs

9. Develop a training program that will enhance the operational capabilities
of the guerrilla force.

10. Select and report landing zones and drop zones (LZs/DZs) that meet
aircraft and resupply requirements.

11. Secure, mark and operate LZs and DZs.


COMPONENTS OF 11TH SPECIAL FORCES EVALUATION

A measure of selected operational requirements, i.e.,
ARTEP 31-101

A situation that stressed intelligence play

A situation that stressed Intelligence and Operations
coordination


CRITERIA FOR EVALUATION

1. Performance on Operational Tasks of ARTEP 31-101

2. Ability to alert SOCGULF to:

a. forward deployment of two tank divisions

b. combat tailoring to meet guerrilla threat

c. new troop deployments

3. Ability of S-2 and S-3 sections to coordinate their
activities

[Figure: IMAGERY INTERPRETATION RESULTS]

[Figure: S-2/S-3 COORDINATION. Resupply track showing the optimal azimuth, the SAM site, and the maximum allowable deviation from optimal; the required azimuth lies somewhere between the maximum deviations.]

RESULTS OF S-2/S-3
COORDINATION REQUIREMENT

Number of DZs Coordinated: 0
Number of Possible DZ Coordinations: 12


EVALUATION CRITERIA

? ARTEP 31-101
X Intelligence requirement
X Coordination requirement


CRITERION-REFERENCED SYSTEMS APPROACH TO EVALUATION OF COMBAT UNITS

Angelo Mirabella

Army Research Institute for the Behavioral and Social Sciences


System Engineering of training and its subsidiary, criterion-referenced
measurement, have been invaluable tools for increasing the job-relevance
of military training and evaluation. These tools have provided
an indispensable point of departure and a framework for ensuring
accountability. However, they have been developed within the context
of relatively simple, procedural tasks that are necessary but not always
sufficient for describing jobs as performed in working environments.

The tools work comfortably for hard, individual skills. But the demurrer
often heard is that the soft skills have yet to be attacked successfully
with those tools. Those of us who have moved the focus of our evaluation
research from the school setting to the combat unit are especially
sensitive to this demurrer because we face the added complication of
two-sided, tactical, collective behavior.

In order to improve the training and evaluation of such behavior,
the Army Research Institute for the Behavioral and Social Sciences has been
pursuing research on Tactical Engagement Simulation. In addition, we have
been developing a supporting system of evaluation. It is the evaluation
system research that I would like to talk about, but for those of you not
familiar with our program let me briefly describe the Engagement Simulation
Test Bed.

Engagement Simulation currently is a set of techniques for conducting
real-time, two-sided, free play tactical exercises at the combined arms
reinforced platoon level. One of its key features is a set of objective
casualty assessment methods which allow almost real-time feedback to participants.
A rifleman, for example, can fire at a target and register a
hit by calling out a number on the helmet of the opposing infantryman.
A tank gunner can similarly register a hit against another tank. Kills
are relayed via radio by a controller to a net control station, which in
turn radios the target that it is out of action. Suitable pyrotechnics
add visual cues and, therefore, realism to the battle*. With these and
other techniques for artillery and anti-tank weapons, it is possible to
measure casualties over time and thereby provide for objective assessment
of the outcomes of tactical performance.
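
As a minimal sketch of what measuring casualties over time could look like, assume the net control station logs each kill as a (minute, side) pair. The record layout, the side labels, and the function name are hypothetical and are not part of the REALTRAIN procedures.

```python
def cumulative_casualties(events, side):
    """Return (minute, running casualty count) pairs for one side.

    events is a list of (minute, side) tuples, one per casualty logged.
    """
    times = sorted(minute for minute, s in events if s == side)
    return [(minute, count) for count, minute in enumerate(times, start=1)]


# Hypothetical log from one exercise.
log = [(3, "attacker"), (1, "defender"), (7, "attacker"), (9, "defender")]
attacker_curve = cumulative_casualties(log, "attacker")
defender_curve = cumulative_casualties(log, "defender")
```

Plotting the two curves against the same time axis gives an outcome record that can later be lined up with process measurements taken during the exercise.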

Several years ago when Engagement Simulation was developed as a
training methodology, its developers felt that the evaluation problem
for unit training had been solved. Objective measures of casualties
were now available! What else was needed? However, an alternative view
was that a great deal else was needed; that the Engagement Simulation
test bed had opened up a Pandora's box with respect to the measurement
and interpretation of unit combat performance. This alternative view
argued for a system of evaluation which included at least concern for
process measurements, and a scheme for uncovering the patterns and
relationships among these two sets of measures, taken continually through
a training exercise. Such a system included some other major features
which I would like to mention.

*TC 71-5. REALTRAIN: Tactical Training for Combined Arms Elements. U.S.
Army Armor School/U.S. Army Research Institute, January, 1975.

In fact, I would like to outline what an adequate system of evaluation
might look like for the Engagement Simulation test bed, and mention
some of our research experiences with various parts of the system (Fig 1).
If we were to proceed logically and efficiently through the development
of an Engagement Simulation relevant evaluation system we would begin
with the development of a model or models to define:

o  measurement  concepts 

o  data  processing  concepts 

o  data  interpretation  concepts 

which are consistent with the purposes for measuring and assessing performance
in the first place. It's at this point that evaluation aims and
philosophical biases can be put on the line. If the major purpose of
evaluation is diagnostic feedback in support of a training system, that
purpose can be made explicit and the rest of the system designed accordingly.
This last statement may seem obvious and self evident, but in
practice, it may not be so obvious. One of the philosophical problems
with the Army Training and Evaluation Program (ARTEP)2 is that it does
not distinguish adequately between evaluation for training diagnosis
and evaluation for accountability. A result has been that many commanders
regard ARTEP as a report card in spite of TRADOC's guidance to the contrary.
This observation, which came out of a current ARI study, suggests
at least one fundamental problem with ARTEP as a training model3.


The next step in system development would be to define the data
requirements and data processing methods that are needed to fit the model
or models constructed in Step 1. If, in Step 1, for example, you decided
that information about patterns of tactical movement is useful for
diagnostic purposes, that would suggest a need to know what fire elements
are where, when. You would need to go further and decide how much information
on position location is needed and how accurate it needs to be.

Now you are faced with Step 3, which requires that you define the
methods for collecting the data identified in Step 2. If you are not yet
familiar with the realities of collecting objective performance data
under field operational conditions, you would soon learn about them at
this stage of system development.

2 ARTEP 71-2. Army Training and Evaluation Program for Mechanized Infantry/Tank
Task Force. June, 1977.

3 Human Sciences Research, Inc. Interim Report (Revised). Improved Army
Training and Evaluation Program (ARTEP) Methods for Unit Evaluation,
21 October, 1977.

Last, but certainly not least, is Step 4, which is to define the performance
benchmarks or standards which make your system criterion-referenced.
This is probably the most difficult of all. It has been sidestepped
to a large extent by ARTEP through the use of expressions like
"Casualties shall not be excessive", with the definition of the benchmark
being left to the evaluation team. ARTEP has also sidestepped the criterion
issue by using mostly procedural standards, which are at a more
global level than those in the old Army Training Tests, but which are
still procedures-based.

I  would  like  now  to  review  some  of  the  progress  that  ARI  has  made 
in  contributing  to  such  a  system. 

Modeling. As part of a long-term effort to validate Engagement Simulation
Training, new experimental versions of ARTEP are being produced.
These are being produced specifically for some developmental tests to
be run at Ft Carson in January of next year. Accordingly, they are being
designed for reinforced platoon missions, i.e., for tank platoons with
supporting infantry squads and tube launched, optically-tracked, wire
guided missiles (TOWs).

For those of you not familiar with ARTEPs, let me show you a typical
page, this one from ARTEP 7-45 FOR MECHANIZED INFANTRY AND COMBINED
ARMS TASK FORCE. (Fig 2) Look particularly at the column labeled
TRAINING/EVALUATION STANDARDS. Phrases like "Coordination...will support...",
"must be responsive", "without sustaining excessive casualties", place
a substantial responsibility on the evaluator. Now look at a roughly
comparable version of an engagement simulation ARTEP (Fig 3). At least
two major revisions in ARTEP concept can be seen.

o The standards and the rating columns have been eliminated. In
their places are a performance data and results section. The measures
are quantitative: e.g., time, range, casualties.

o Major objectives have been further analyzed into intermediate
objectives. For example, the task of eliminating enemy resistance has
been analyzed into the weapons systems involved and then further broken
down into weapon system sub-tasks.


What's happened to the standards? This particular model of ARTEP
candidly admits that we don't know how to handle the standards problem
yet and moves the problem to one side, until scientific progress in this
area provides some useful methodology. Emphasis shifts here away from a
GO/NO GO type of evaluation. Emphasis is placed instead on obtaining a
rich, detailed description of the behaviors involved in two-sided combat.

That emphasis leads to two essential questions:

o What patterns of behavior can we extract from the various
performance measures which will have diagnostic value? We have, for example,
a particular interest in showing the connections or correlations among
tactical movements (i.e., position by time), processes such as first
enemy detections, and outcome measures such as casualties inflicted or
sustained.

o What performance trade-offs can we identify and measure? A
commander may deliberately sacrifice cover and concealment in order to
fight more aggressively or move more quickly towards some tactical
objective. The significant and diagnostically useful measurement concept
would be risk-taking behavior, instead of just cover and concealment.

Again let me ask the question: "What has happened to standards?" We
haven't forgotten about them. Until the standards problem is solved, an
evaluation system is not criterion-referenced. But we have concluded
that some imaginative and fresh thinking is required here, along with
supporting research. The concept which we are currently working on can
be described as situation-specific forecasting of the dynamics of an
engagement simulation exercise, along with various tactical processes and
outcomes such as casualties. I'll say a little more about this in a
few minutes.

The second step in the evolution of an evaluation system is to define
performance data requirements and data processing techniques. The
modeling of Step 1 can provide the general guidance for this step. But more
specifically, our approach has been to identify essential elements of
analysis (EEAs) and then to produce measures of effectiveness (MOEs) by
phase of combat. This is consistent with the ES ARTEP model. Under
contract to ARI, Human Systems, Inc. of Heschls (HSI) has generated a
computer listing of EEAs along with methods for coding, processing, and
displaying the results of a computer analysis of a tactical map.4 Fig 4
shows the initial list which HSI generated. What the data file does,
in effect, is to describe and display the tactical movements of two
opposing combat teams, Alpha and Bravo, involved in an Engagement
Simulation exercise. The list indicates which fire elements are in what
locations, at what time, what the terrain is like, whether or not there
are targets of opportunity, and what casualties are resulting from direct
and indirect fire.

Currently, these EEAs are being put together in various ways to
provide Measures of Effectiveness (MOEs) for each of the phases of a
reinforced platoon attack mission. This mission will be the basis for a
developmental test at Ft Carson in January. The phases will be:

o Preparation (i.e., Planning)

o Pre-Engagement (i.e., Movement to Contact)

o Engagement (Hostilities)

o Post-Engagement (Post-Attack Security)


4 Hansen, D.M. and Drewfs. Small Unit Data Input Structure and Graphic
Support System. Interim Report. Human Systems, Inc., 28 June 1977.
Some examples of Measures of Effectiveness which we are looking at
might make what I'm saying more concrete. Fig 5 shows an MOE for Tactical
Formation in the Pre-Engagement Phase. An example from the Engagement
Phase is shown in Fig 6. In this case an MOE for enemy detection is
illustrated.

The methods for processing data in the HSI/ARI data file are
somewhat constrained by their small quantity. Tactical exercises, unlike
most individual tasks, require several hours if not whole days to complete
and are very costly in manpower and supplies. Consequently, data are
relatively scarce and do not readily lend themselves to sophisticated
multivariate analysis. Therefore, until a large data base is built up,
we will probably not be able to do much beyond cross tabulations of
frequency counts. This is the tack which we have been taking so far. But
such a tack is consistent with our near-term goal, which is to build up
experience with objective measurement of tactical exercises and to learn
which measures are most useful for diagnosis of training deficiency.
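
To make the idea of a cross tabulation of frequency counts concrete, here is a minimal sketch in Python. It is only an illustration, not the HSI/ARI processing method: the record layout, the phase labels used as rows, and the "first detection" column are assumptions, and the data are invented.

```python
from collections import Counter

# Hypothetical exercise records: (phase of combat, team making first detection).
# These values are invented purely for illustration.
records = [
    ("Pre-Engagement", "Alpha"),
    ("Pre-Engagement", "Bravo"),
    ("Engagement", "Alpha"),
    ("Engagement", "Alpha"),
    ("Engagement", "Bravo"),
    ("Post-Engagement", "Alpha"),
]

def cross_tabulate(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occurred.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = cross_tabulate(records)
for phase, row in table.items():
    print(phase, row)
```

With so few exercises available, a table of this kind is about as far as the analysis can responsibly go, which is the point made above.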

Having defined data requirements and even modest data processing
approaches, the next step in evolving an evaluation system is to define
data collection methods. From a technological point of view, collection
of position location information is our hottest chestnut. The problem
is a very critical one because position location, i.e., tactical
trajectory, is the foundation of the HSI/ARI data sub-system. And without
good position data, that sub-system is a house of cards.

When we began researching the ES evaluation problem several years
ago, we anticipated access to Army instrumented ranges. We soon
discovered that such ranges were few, far between, expensive to operate,
and mostly unavailable. As a result, we began a small study of low-cost
portable alternatives to such facilities as those at the Combat
Developments Experimentation Command and at Ft Hood. The study was
particularly geared to supporting the upcoming Ft Carson test. The study, by
Behavior Technology Consultants, Inc., looked at optical triangulation,
optical ranging, unattended ground sensors and a number of radio ranging
techniques. It recommended a radar ranging system which appears to be
portable, relatively low cost and sufficiently accurate, but which is
still beyond our resources and which could not be put together in time
for Carson anyway. As a result, we are developing some labor-intensive
strategies for plotting tactical movements. These will involve
intensive map and terrain training of data collectors, and systematic
cross-checking of results. In addition, the HSI methodology includes some
techniques for screening bad position data and estimating missing points.
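
The HSI screening and estimation techniques are not described here, so the following Python sketch should be read only as one plausible way such a step could work, under stated assumptions: a position fix is treated as bad if it implies an impossible speed since the last accepted fix, and missing or rejected points are estimated by linear interpolation between neighboring accepted fixes. The function name, the record layout, and the speed threshold are all hypothetical.

```python
def screen_and_fill(track, max_speed):
    """track: list of (time, x, y); x and y are None when a fix is missing.

    Returns (time, x, y) for every time that lies between two accepted
    fixes, with bad or missing fixes replaced by linear interpolation.
    Assumes the first reported fix is trustworthy.
    """
    good = []
    for t, x, y in track:
        if x is None or y is None:
            continue  # missing fix
        if good:
            t0, x0, y0 = good[-1]
            dist = ((x - x0) ** 2 + (y - y0) ** 2) ** 0.5
            if dist > max_speed * (t - t0):
                continue  # implied speed too high: treat as a bad fix
        good.append((t, x, y))

    filled = []
    for t, _, _ in track:
        before = [g for g in good if g[0] <= t]
        after = [g for g in good if g[0] >= t]
        if not (before and after):
            continue  # cannot interpolate outside the accepted fixes
        (t0, x0, y0), (t1, x1, y1) = before[-1], after[0]
        if t1 == t0:
            filled.append((t, x0, y0))
        else:
            f = (t - t0) / (t1 - t0)
            filled.append((t, x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return filled

# Invented example: one missing fix (t=1) and one wild fix (t=3).
track = [(0, 0.0, 0.0), (1, None, None), (2, 20.0, 0.0),
         (3, 500.0, 0.0), (4, 40.0, 0.0)]
cleaned = screen_and_fill(track, max_speed=15.0)
```

A real procedure would also need to cope with a bad first fix and with terrain that rules out straight-line movement; this sketch ignores both.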

If we can succeed in adequately solving the dilemmas posed by Steps
1, 2, and 3 of an ES evaluation system, we will have achieved a great
deal. However, we still won't have, in my opinion, a criterion-referenced
system. Being able to collect various kinds of process and product data,
and being able to relate these data to each other, is very critical. But
their interpretation and usefulness for training diagnosis is incomplete
without performance benchmarks or standards. The problem of standards
has been avoided for two-sided combat training exercises because such
exercises are situation-specific and involve a very complex and not well
understood set of variables.

5 O'Heeron, M.K., Howell, W.Y., Frailer, T.W., and Johnson, B. Field
Measurement and Data Collection System for Engagement Simulation Field
Exercises. Final Report. Behavior Technology Consultants, Inc.
1 October 1977.

FIG 6: EXAMPLE OF MEASURE OF EFFECTIVENESS
(Figure not reproduced. Surviving labels: Team Identifier, Hex Number, and
Time; Location of Target; Location of Detector.)

Accordingly,  we  have  underway  a  basic  research  program  to  explore 
[...] several years ago by Litton under contract to ARI. This model, the UNIT
PERFORMANCE ASSESSMENT MODEL (UPAM), used the policy capture technique
to generate indices of combat proficiency. The index values resulted
from a linear combination of variables, mostly reflecting various casualty
measures. Commanders' forecasts of these measures for an upcoming
exercise were to provide benchmarks against which actual data were to be
compared.
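
The UPAM approach as described, an index formed as a linear combination of measures and compared against a commander's forecast, can be sketched as follows. This is an illustration only: the variable names, the weights, and the numbers are hypothetical and are not taken from UPAM.

```python
# Hypothetical weights on casualty-type measures; NOT the actual UPAM values.
WEIGHTS = {
    "enemy_casualties": 1.0,
    "own_casualties": -1.5,
    "vehicles_lost": -2.0,
}

def proficiency_index(measures, weights=WEIGHTS):
    """Weighted linear combination of exercise measures."""
    return sum(weights[name] * measures[name] for name in weights)

# Invented data: the commander's forecast and the observed exercise results.
forecast = {"enemy_casualties": 6, "own_casualties": 2, "vehicles_lost": 1}
actual = {"enemy_casualties": 5, "own_casualties": 4, "vehicles_lost": 1}

benchmark = proficiency_index(forecast)  # forecast serves as the benchmark
observed = proficiency_index(actual)
shortfall = benchmark - observed         # positive means worse than forecast
print(benchmark, observed, shortfall)
```

In a policy capture study the weights would be estimated from expert judgments rather than assigned by hand as they are here.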

A more recent modeling effort is called COMBAT OPERATIONS TRAINING
EFFECTIVENESS ANALYSIS (COTEAM). This effort picks up on the concept of
situation-specific forecasting to provide performance benchmarks, but
develops the concept further in some ways we think are quite significant.

COTEAM  hopes  to  do  several  key  things: 

(1) Define methods for forecasting products, processes, and dynamics
of ES exercises in a situation-specific context. The UPAM system required
forecasting but left the problem up to the CO's own devices.

(2) Partial out non-training effects such as terrain, mission type,
and force ratios, and do so by addressing two kinds of benchmarks:

o Training System Referenced

o Combat Referenced.

The  curve  in  Fig  7  suggests  why  you  need  two  types  of  benchmarks. 

Imagine that you could unambiguously define training system
benchmarks (1-5) representing various points in a training cycle, e.g.,
1 = entry level performance, 5 = final stage in unit training. Now
imagine that you could define various sets of operational conditions (S1,
S2, S3) where such variables as weapon mix, terrain type, doctrine, and
force ratio define the sets. Further imagine that we could generate
performance curves as functions of training level and operational condition
set. What conclusions could you draw from Figure 7?

(1) The conditions of S1 are such that training effects are
completely overwhelmed. As a training manager you would avoid S1 since it
does not allow for a differentiation across levels of training.

(2) S2 and S3, on the other hand, would both be potentially useful to
the training manager.

(3) The individual who sits above the training manager, and balances
training, force development, and doctrine, would probably exclude both S1
and S2 as potential training conditions because neither set of conditions
permits training to bring combat units up to an acceptable level of
readiness.

But how do we go about actually generating the benchmark curves?
The answer to that question is the subject of current research. We have
expectations of developmentally testing some techniques at Carson in
January, e.g., the Delphi Technique as a way of systematically extracting
predictions from experts. Other possible techniques would be combat board
games and computer simulations. We collected some preliminary data on
forecasting during a developmental test of rifle squad engagement
simulation last April at Ft Ord, without benefit of the Delphi method or
board games. We wanted to get a feeling for the kind of data which might
result.

Fig 8 describes the scenario and instructions which were given to
subjects for a squad movement to contact. The subjects were NCOs acting
as squad leaders.

Fig 9 shows the kinds of forecasting that were done by the NCOs and
the data that resulted. Figs 10 and 11 show the scenario, instructions, and
resulting data for a hasty defense. Generally, our impression was that
forecasting could be done with some reliability and that the task of
forecasting for different assumed training levels was not an
insurmountable one. Our subjects did seem to be able to discriminate expected
tactical performance across assumed training levels.

I've tried to provide a broad and very surface view of a complex
research program. I really have not done justice to the scope of effort
involved. Some indication of the size of the effort is its staffing.
Approximately 16 ARI behavioral scientists with advanced degrees are
partially or fully involved with the program. They are supported by the
services of four private behavioral research companies. The very active
and indispensable support of our TRADOC sponsors probably adds another
five professional man-years.5

What the pay-off for this effort will be, I cannot predict. But its
significance lies at least partially in its potential contribution to the
Army's proposed National Training Center at Ft Irwin, California. Large
sums of money are likely to be invested in the production of a very
sophisticated instrumented range, capable of generating enormous quantities
of high resolution data. If the Army's capacity to select, process, and
interpret those data for training purposes does not match its capacity to
supply the hardware and engineering involved in instrumenting a range,
the Ft Irwin concept may not reach its full potential.


5 Our sponsors are the Training System Manager for Tactical Engagement
Simulation Systems, Ft Eustis, and the Directorate of Training Developments,
Ft Knox.
FIG 8: MOVEMENT TO CONTACT AGAINST AN OP

10 man attacking squad (tested squad)

4 man (standard) defense

Scenario: A ten-man squad is the point element of the platoon in a
movement to contact. The squad will know that they can expect
contact at any moment. They will have just crossed a danger area
where they encountered sniper fire, without taking any losses.

The squad is now approaching an enemy OP, consisting of four men
with a machine gun in well concealed positions. Time t = 0 occurs
as the squad clears the danger area.


Instructions 

Your  own  opinions  and  estimates  are  being  requested.  This  is  NOT  a 
test  of  your  personality;  the  data  will  be  used  strictly  for  scientific 
purposes. 

Assume  that  all  members  of  your  squad  have  only  been  through  Basic 
Combat  Training  (BCT).  Now  on  the  next  two  pages,  go  down  the  first 
(BCT)  column,  and  put  your  answer  in  each  box  for  each  question.  If 
a  more  detailed  answer  is  called  for,  use  the  reverse  side  of  the  paper. 

Now assume that your squad has recently passed Infantry Level 2 ARTEP,
shown in column two. Answer each of the questions again for this column.

Assuming three days of SCOPES training, answer the questions again in
the third column.

Finally, assume that all members of your squad are combat experienced
Rangers, and answer all questions in the boxes for the fourth column.
FIG 9: Forecasts of Infantry Squad Movement to Contact, by Level of
Training of Tested Squad (table not reproduced)


FIG  10:  HASTY  DEFENSE 
7  man  attacking  squad  (controlled  aggressor) 

10  man  defending  squad  (tested  squad) 

Scenario: A ten-man squad establishes a hasty defense as part of a
larger platoon defensive perimeter. They will have approximately 15
minutes from the delivery of the frag order to establish the hasty
defense. At that time an enemy counterattack, consisting of 7 men
with a machine gun, will begin its approach toward the defensive
positions. The counterattack movement will begin at a position
approximately 100 meters from the defense. Time t = 0 occurs with
the delivery of the frag order.


Instructions 

Your own opinions and estimates are being requested. This is NOT a
test of your personality; the data will be used strictly for scientific
purposes.

Assume that all members of your squad have only been through Basic
Combat Training (BCT). Now on the next two pages, go down the first
(BCT) column, and put your answer in each box for each question. If
a more detailed answer is called for, use the reverse side of the paper.

Now assume that your squad has recently passed Infantry Level 2 ARTEP,
shown in column two. Answer each of the questions again for this column.

Assuming three days of SCOPES training, answer the questions again in
the third column.

Finally, assume that all members of your squad are combat experienced
Rangers, and answer all questions in the boxes for the fourth column.


FIG 11: Forecasts of Squad Hasty Defense, by Level of Training of
Tested Squad (table not reproduced)
RELIABILITY IN MEASURING UNIT PERFORMANCE


A central problem in all evaluations, and especially in evaluations
of combat units, is how to incorporate characteristics of good
measurement in the evaluations. Characteristics of good measurement include
comprehensiveness, cost-effectiveness, validity, and reliability. Our
concern in this paper is with reliability; for without reliability,
comprehensiveness is of little value, and cost-effectiveness and validity
cannot be achieved.

Reliability  refers  to  the  extent  to  which: 

1. Two or more independent observers produce
similar results, and

2. Measures of an event taken at one time are
identical to measures of the same event
taken at another time.

The performance of combat units, at least in the Army, is
increasingly being evaluated in the context of large-scale, free play simulated
combat exercises. The ARTEP (Army Training and Evaluation Program) is
an example. The results of performance evaluations in simulated combat
are used by policy makers in decisions about training needs and combat
readiness. Given the importance of decisions about training needs and
combat readiness, and given the dependence of these decisions on unit
performance evaluations, a question naturally arises as to how to
maximize reliability in measuring unit performance.


Purpose 

The  purpose  of  this  paper  is  to  present  hypotheses  about  variables 
that  affect  the  reliability  of  unit  performance  measurement,  and  to 
outline  research  for  testing  the  hypotheses. 

Sources of Measurement Reliability

Measurement can be viewed as consisting of three phases:

1. Observer Preparation.

2. Observation.

3. Recording and Reporting.
Variables that affect measurement reliability are at work within each
of the three phases of measurement: variables that affect the extent
to which two or more observers produce similar measurement results, and
the extent to which measures taken at a given time are representative
of measures taken at another. Hypotheses about the variables in each
of the three measurement phases follow.

Observer  Preparation 

Reliability of measurement will increase with the consistency or
uniformity of understanding among observers about the rules of
observation and recording. Ideally, observers should be standardized, and
measures should be taken to assess the degree to which they have been
standardized. Measurement reliability may be increased by manipulating
the following variables in the observer preparation phase:

1. Specificity of instructions. Reliability is
likely to be greater when the instructions to
observers are highly specific than when
instructions are general and loosely stated.

2. Timing of instructions. Instructions to
observers should not be given so far in
advance of observation as to permit
forgetting, or so late as to preclude learning.

3. Practice in observing and recording.
Measurement reliability will be greater when observers
have practice measuring and recording the
events of interest than when they have not.

The practice variable interacts with timing of
instructions, in that instructions to observers
should be given far enough in advance of
observation to allow time for practice.

4. Testing observers. Measurement reliability can
be indirectly increased by the use of tests to
make sure that observers are capable of
performing whatever measurement operations will be
required of them.

Observation 

Even with very careful observer preparation and totally
standardized observers, measurement reliability will be affected by variables
at work during the observation (measurement) process.
Properties of the events or things to be measured can affect
measurement reliability. Measurement of unidimensional events will,
for example, be more reliable than measurement of multidimensional
events (all other things being equal). This is related to perceptual
"clutter," or limits on observers' information-processing abilities.
Within rather broad limits, observers who are asked to make large
[...] reliable results than will observers making smaller numbers of
observations.


Another property of the events or things to be measured that
affects measurement reliability is stability (or its opposite,
transience). The results of measuring the diameter of a wooden ball
will, for example, be more reliable than will the results of measuring
a mercury "ball", once again, all other things being equal.


Other properties of events to be measured that will influence
reliability are time-sharing, noise, and "observability"; that is,
measurement reliability may be expected to decrease with the extent
to which the observed event is:

1. Time-shared with other events.

2. Embedded in noise.

3. Not directly observable.

Strategies, rules, and procedures for measurement also affect
reliability. Observers may be expected to perform more reliably,
for example, to the extent that they are:

1. Required to make comparative rather than
absolute judgments.

2. Given a well defined standard stimulus.

3. Alerted as to what to observe (anticipate
likely errors).

4. Given the opportunity to observe an event
more than once.

5. Given scoring aids or templates.

6. Required to measure only, and not process
measurement results.
Recording and Reporting

Even with adequate observer preparation and careful control of the
measurement process, measurement reliability will be affected by variables
operating during the recording and reporting of measurement results.

These variables include:

1. Timing. Measurement reliability will increase
with decreased time between observation of the
event of interest and recording of results.

2. Design of recording form. Well designed
data recording forms minimize the amount of
judgment and decision-making required for
their use, and thereby increase the reliability
of recorded results. Simplicity in
data-recording forms, for example, minimizes
data-recording time, and therefore allows more time
for observation.

Unit performance measurement probably is unreliable because of the
influence of all of the variables mentioned above. These variables
serve to decrease the reliability of operations as simple and
straightforward as measuring length with a ruler. The considerable complexity
of free play simulated combat guarantees that measurement reliability
problems will be great.

In the observer preparation phase, for example, observers may not
be standardized for any number of reasons. Instructions for measurement
may be too general, and may not be given at the right time. Observers
may not have enough practice to permit performing their measurement
duties in accordance with the intent of the test designers. And
practical constraints (e.g., time, money) may preclude ascertaining whether
observers are capable of performing their measurement duties before
"turning them loose."

In the observation phase, observers may be required to make
simultaneous judgments along more dimensions than their sensory apparatus
can comfortably handle. The measurement instruments may permit too
much subjectivity and expertising. Strategies for measurement may be
inappropriate (single rather than multiple observations, for example).
And the nature of the required judgments and decisions may invite
unreliability.

In the recording and reporting phase, unreliability may be
promoted by the length of time between observation and recording of results,
and by formats for recording results.

The possible influences of the variables discussed above demand
that research be undertaken on methods for improving the reliability
of unit performance measurement, for measurement without reliability
will lead to wrong decisions about training needs and about readiness.


Photography and Measurement Reliability

The conduct of measurement reliability studies requires that
whatever is to be observed and measured (simulated combat, for example)
must:

1. "Sit still" long enough to permit observers
to make the required measures.

2. Be presented uniformly or varied
systematically for various groups of observers.

These two requirements, and the high cost of field studies using
simulated combat, make the conduct of field studies of measurement
reliability impractical. The requirements for "sitting still," for uniform
or systematically varied presentation, and for low cost can be met
by the use of photography.

Motion pictures of simulated combat can be made, using real combat
vehicles or models. Models seem preferable for two reasons. The first
is low cost. The second is that research on reliability of measuring
unit performance does not require perfect fidelity or realism in the
events to be observed and measured. As noted earlier, the main
requirement is for a set of events that can be presented uniformly to various
observers, or varied in accordance with requirements of the experimental
design.

Subtle errors in tactics and operations can be deliberately
incorporated into motion pictures, for the purpose of producing variability
in observers' response to events presented in the film. And by editing
videotape versions of the film, the amount of information available to
various groups of observers can be systematically varied.

Studies of reliability in unit performance measurement should take
the following general form: A set of events is selected for observation
and measurement (e.g., a part of the ARTEP). Several groups of subjects
view the events, observing, measuring, and evaluating according to
instructions and experimental conditions. Systematic variations are
introduced in variables in any or all of the three phases of measurement.
As implied earlier, variations could be introduced in the kinds of
instructions given to observers, the specificity of the instructions,
amount of practice given to observers, kinds of instruments and
measurement strategies, and so forth. In all cases the dependent variable is
an index of inter-observer reliability; e.g., a simple
"percent-agreement" score to indicate the extent to which observers produce
similar results measuring the same things. Variables that affect reliability
are identified, and can be incorporated into "how-to" literature for
reliable unit performance measurement.
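
A simple percent-agreement score of the kind mentioned above could be computed as in the following Python sketch. The pairwise definition used here (the share of observer-pair comparisons that agree, pooled over events) is one common convention and is an assumption; the observer names and GO/NO GO ratings are invented.

```python
from itertools import combinations

def percent_agreement(ratings):
    """ratings: dict mapping observer -> list of scores for the same events,
    listed in the same order for every observer.

    Returns the percentage of (observer pair, event) comparisons that agree.
    """
    agree = 0
    total = 0
    for scores_a, scores_b in combinations(ratings.values(), 2):
        for a, b in zip(scores_a, scores_b):
            total += 1
            if a == b:
                agree += 1
    return 100.0 * agree / total

# Invented ratings of four filmed events by three observers.
ratings = {
    "observer_1": ["GO", "NO GO", "GO", "GO"],
    "observer_2": ["GO", "GO", "NO GO", "GO"],
    "observer_3": ["GO", "NO GO", "GO", "NO GO"],
}
score = percent_agreement(ratings)
print(score)
```

Percent agreement does not correct for agreement expected by chance, so an experimenter comparing conditions with very different GO rates might prefer a chance-corrected index.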

The conduct of research along the lines suggested above seems
warranted, because the results would lead immediately to action
recommendations for improving measurement reliability, and could be incorporated
directly into any program for measuring unit performance.


PERSONNEL  TURBULENCE  AND  TIME 
UTILIZATION  IN  AN  INFANTRY  DIVISION* 

Dr. Hilton M. Bialek

Diana  Zapf 
Wendee  McGuire 


Introduction 


In its attempts to comply with recent DOD policy "that learning
objectives which can be accomplished more economically in the
operational unit, and without unacceptable degradation of unit readiness,
should be provided as OJT rather than as individual training", the
Army has instituted a number of R&D efforts designed to decentralize
training. A number of these efforts, including one I have been involved
in for the past two years, utilize the squad leader as a primary
instructor. The idea, in addition to decentralization, is to enhance the
leadership role of the squad leader by making him primarily responsible
for the individual skill proficiency of the men under his command.

For an instructional system like this to work, some sort of
personnel stability would seem necessary. A squad leader needs sufficient
time to learn the strengths and weaknesses of his men, time to create
a group identity and cohesion, and time, of course, to provide
instruction. How stable, then, are TO&E companies and squads? That is one
question we attempted to answer. The second question had to do with
the utilization of time: "How much time does a squad leader typically
have to actually devote to training?" These two questions guided
the design and conduct of the study I will now describe to you.


Approach 


To investigate these questions, two main sources of information
were used:

• The manning reports submitted monthly by each company
to the battalion headquarters.

• A large sample of 15-minute-by-15-minute first-hand
observational records of the daily activities of
individual squad members.

Manning reports described the flow of personnel in and out of the four
companies in the sample, as well as the duty positions and MOSs of each
man in the company. The second data source, observations, provided
information concerning what men in a sample of squads selected
from the four companies were doing on a quarter hour by quarter hour
basis and how long they were doing it.

The operational unit in this instance was a CONUS infantry
division. Observations and sub-unit sampling focused on 11B and 11C MOSs
because these MOSs had been selected as the initial content of the
individual skill training system under development.

The  Sample 


The sample selected reflected our interest in studying turbulence
on both the individual and unit level, for MOSs 11B and 11C. The sample
of companies and squads within those companies was chosen to represent:


SLIDE  1  HERE 


Results


I will report first the analysis of the data available from the
manning reports and the accompanying information obtained by checking
battalion records. Later I will focus on the data obtained from the
daily observation phase of the study. Movements in and out of companies
and squads are shown in Slide 2.

Slide 2 shows that after 4 months:

24% were no longer in the company.

24% had moved to another squad within the company.

In terms of stability at the squad level, we see that:

52% were in the same squad.

36% were in the same duty position in the same squad.

16% were in different duty positions in the same squad.

As backup data for this manning report information, the observers
we employed to measure time utilization were required to record the
actual names of squad members each time they spent the day with the
squad. Averaging the results from observing 10 squads gave the results
shown in Figure 3.


SLIDE 3 HERE

The results show that 31% of the original squad members left the squad
over a two month period. Assuming a linear relationship, there is
about a 15% turnover per month. Comparing this to Figure 2 (48% left
the squad over a four month period, giving a monthly rate of 12%) shows
the estimate to be quite close. Also shown is the number of "movements"
(the arrival or departure of a member) experienced for the composite
squad (the average of the ten squads observed). The total number, 8.67,
is perhaps a more sensitive turbulence indicator than simple turnover
(the proportion of positions initially observed that are held by someone
else at the end of the observation period) because it includes individuals
who arrive and depart between the initial and final observation points.
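The arithmetic behind these turnover figures can be retraced in a few
lines. The sketch below uses only the numbers quoted in the text and on
Slide 3; the variable and function names are illustrative, and the
end-of-period strength is derived here rather than quoted from the slide.

```python
# Composite-squad figures quoted in the text and on Slide 3.
start_strength = 7.55   # men in the squad at the start of observation
originals_left = 2.33   # original members who left within two months
new_joined = 4.78       # new men who joined during the period
new_left = 1.56         # new men who had left again by the end


def monthly_rate(total_percent, months):
    """Monthly turnover under the linear assumption stated in the text."""
    return total_percent / months


two_month_turnover = 100.0 * originals_left / start_strength
rate_observed = monthly_rate(two_month_turnover, 2)   # observation data
rate_manning = monthly_rate(48.0, 4)                  # manning reports

# Every arrival or departure counts as one "movement".
movements = originals_left + new_joined + new_left

# Derived, not quoted: originals remaining plus new men remaining.
end_strength = (start_strength - originals_left) + (new_joined - new_left)
```

The movement count exceeds simple turnover because the 1.56 men who
both arrived and departed within the period never appear in a
start-versus-end comparison.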

We will reserve comment on these findings until after we have
presented our time utilization results.


The second part of the study - measuring daily turbulence and
time utilization - involved a direct observation technique. Four
volunteers from an engineering battalion were trained as observers.

Each observer was assigned to one of the four companies comprising the
sample. Each day the observer would meet with the company at morning
formation and spend the remainder of the duty day with a designated
squad,* following them wherever they went. The observer carried a clip
board with a copy of the observation data sheet attached, and every 15
minutes recorded the activity of each member of the squad on the basis
of two decisions: (1) which of six major activity areas is the soldier
involved in? and (2) within that activity area, which of four modes is
he in? The six activity areas are described in Table 1, along with
examples.

Within each of these activity areas, the observed person was categorized
into one of four modes: receiving information, performing tasks, waiting
to receive instruction, or en route to or from activity.

During the training of observers, reliability checks were conducted
by having two observers observe the same squad for a full day. For
over 90% of those time units observed and recorded, both observers
recorded the same categories. The differences that did occur were mostly
in the "mode" dimension, whereas the major activity areas seemed clearly
discriminable.


*On those occasions when the unit was engaged in night training exercises,
the observer would spend the night observing the squad.

Using this data collection procedure, the activities of a total
of 166 observer days, or nearly 40,000 15-minute time units, were
recorded. I will now talk about how these 40,000 time units were
distributed.

The results are organized to show how, for a typical or average
training day, non-training day, and overall average day (the two types
of days combined), time and people are utilized. Slide 5 shows the total
number of time units observed, broken down by number and percentage.
Each time unit represents one man for a period of 15 minutes. For
example, 2901 time units were recorded as "unit training" during the
training days. This number is 14% of the 20,626 time units observed during
training days. Note that a category labeled "absence" (turbulence) is
included in the total number of time units. This is the number of time
units lost because individuals who were officially available for duty were
not in fact present. These absences could range from one time unit during
the day to the entire day. The numbers shown in Table 2 indicate that
18% of the time units available during training days (15% during
non-training days and 16% overall) were unused because of absences.

Perhaps a more direct way of portraying the results is to show
what the typical infantryman spends his time doing in a typical duty
day, and how much time he spends doing it.

The next slide shows this for a training day; I'll not load you down
with too much by showing the non-training and combined day, but obviously
they show less training time.

Turning to the question of what soldiers do when they are absent
from the squad (on the average of one hour 23 minutes per day), a
breakdown of their activities appears in the next slide.

Results are, again, shown for a training day, a non-training day, and
both days combined. The major turbulence causing activities seem to be
work details (27% of the absences during a training day are a result
of this) and military schooling (another 21%). The remainder of the
time is a result of the other activities listed on the slide. Absences
from the squad occur as frequently during actual training time (unit
and individual) as during other activities.

One final analysis shows that of the two hours seven minutes
designated as training (unit and individual) spent on a "training" day, the
average soldier spends 57 minutes of that time actually engaged in
hands-on performance behavior. He spends another 25 minutes per day
receiving instruction. The last slide shows how the remainder of his
actual training time is distributed.
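The split of designated training time by behavioral mode can be
recomputed from the minutes reported in Table 4. The sketch below is
illustrative only: the names are hypothetical, and grouping waiting, en
route, and "other" time together as non-productive is an assumption made
here for the calculation.

```python
# Minutes per mode on a training day, as reported in Table 4.
mode_minutes = {
    "receiving instruction": 25,
    "performing tasks": 57,
    "waiting to receive instruction": 10,
    "en route": 23,
    "other": 11,
}

# Total designated training time quoted in the text: 2 hrs 07 min.
training_minutes = 2 * 60 + 7

mode_percent = {
    mode: round(100 * minutes / training_minutes)
    for mode, minutes in mode_minutes.items()
}

# Assumption: these three modes are treated as non-productive time.
lost_minutes = sum(
    mode_minutes[m]
    for m in ("waiting to receive instruction", "en route", "other")
)
lost_percent = 100 * lost_minutes / training_minutes
```

The individual modes sum to 126 minutes, one short of the stated total,
which is consistent with rounding of the per-mode averages.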


Comments 


First a methodological comment. I think this direct observation
technique for obtaining data is highly useful and can be applied any
time one wishes to find out whether or not organizational policy changes
do in fact change patterns of action and time utilization. It is not
difficult to train observers and the system is hardly affected by their
continuous and extended presence.

Turning now to the results themselves:

Although the problems of turbulence and efficient use of time are
widely recognized, it seems that this close look at a 3-4 month
period in the life of an operational unit is less than comforting.
During that period, overall, less than 25% of the time is actually
devoted to training, and of that time about one-third is lost to delays,
movement, and other minor factors.

Certainly the amount of movement of personnel both within and into
and out of an operational unit appears excessive, and the obvious
question is, "Is it all necessary?" Are there personnel management
policies in the Army which contribute to this movement? Are commanders
usually short-handed and therefore needing to shuttle people around to
fill in gaps on a temporary basis? Are administrative and support
requirements given greater priority than training/operational
requirements? Is the inefficient use of time reported above a consequence of
policy decisions, a breakdown in line of command, or due to yet other
causes? In other words, is the picture conveyed in this study inevitable,
or can certain elements be identified which, if modified, would change
the pattern of movement and time usage described above?

It may well be, however, that the pattern revealed in this particular
kind of operational unit (an infantry division) is not found
in other kinds of Army units, operational or otherwise. There may be
something special about combat arms (or infantry specifically) that
exacerbates the problem. For example, combat oriented operational units
are in the unique position of never (except in actual combat) having to
perform the jobs they are trained for. Thus, they are always in
"training" or in a state of preparation. The distinction between such a
unit's operation and its preparation or training for that operation is,
at best, fuzzy. It may well be that this unusual circumstance leads to
greater turbulence and inefficient time usage as compared to other
operational units (a transportation unit, an administrative center, many
combat support activities). Another example of a factor that might
contribute to turbulence is the priority training is given vis-a-vis the
other demands placed on an operational unit: housekeeping, maintenance,
unit missions, officer career requirements, etc. The point of this
discussion is that a case can be made for investigating the causes and
conditions that relate to turbulence and time utilization because (a) it
is highly likely that these two phenomena are related to organizational
effectiveness and efficiency, and (b) the factors underlying them, once
identified and isolated, can probably be greatly modified so as to
improve effectiveness and efficiency. It would appear therefore that
further direct investigation into the causes and amelioration
of these phenomena is warranted while attempts to design training systems
which can handle them continue.

SLIDE 1

1. Brigades: Two companies each from the division's two maneuver
brigades.

2. Battalions: Four of the six maneuver battalions were represented
in the sample.

3. Kinds of companies that contain most of the 11Bs and 11Cs:
Rifle and combat support companies were studied.

4. The existing ratio of 3 rifle companies to 1 combat support company:
3 of the companies in the sample were rifle companies, the fourth
was a combat support company.

5. Kinds of platoons: 10 different platoons within the 4 companies
were studied, one squad from each platoon.

SLIDE 3

AT THE START OF A TWO MONTH OBSERVATION PERIOD, THERE WERE
7.55 MEN (100%) IN THE SQUAD.

DURING THE NEXT 2 MONTHS:

2.33 MEN LEFT THE SQUAD (31% of the original number),

AND 4.78 NEW MEN JOINED THE SQUAD,

OF WHOM 1.56 THEN LEFT THE SQUAD.

THERE WERE 8.67 "MOVEMENTS" (in and out) WITHIN
A SQUAD OVER THE 2-MONTH PERIOD,
SLIGHTLY MORE THAN ONE PER WEEK.

AT THE END OF THE 2-MONTH PERIOD, THE SQUAD COMPOSITION WAS

5.22 ORIGINAL MEN (69% of the original number)

3.22 NEW MEN

TOTAL MEN

Figure 2. Composite Squad Level Turbulence.

TABLE 1. MAJOR ACTIVITY OBSERVATION CATEGORIES

ACTIVITY CATEGORY                     EXAMPLES

UNIT TRAINING (U)                     ARTEP
  Focuses on training individuals     Field exercise: squad ambush.
  to perform as members of a team     Indoor class on assembly area
  or unit.                              procedures.
                                      Field exercise: company defense.

INDIVIDUAL TRAINING (MOS Skills) (I)  Weapons qualification.
  Focuses on the skills (tasks)       Indoor class on camouflage
  which the individual needs to do      techniques.
  his job.                            Outdoor class on mine detector
                                        training.
                                      EIB training.
                                      Mortar crew drill.
                                      Class on first aid.

INDIVIDUAL TRAINING (PT)              PT
                                      Unit team athletics.
                                      Physical readiness training.

TEACHING ACTIVITIES (T)               Teaching a class on land
  Teaching or assisting in teaching     navigation.
  for unit or individual training.    Demonstrating how to set up a
                                        minefield.

SUPPORT/GARRISON (S)                  Weapons issue and turn-in.
  Activities which support            Maintenance of weapons, equipment,
  training; garrison duties.            vehicles.
                                      Maintenance of billets/buildings.
                                      Work details.
                                      Parades.
                                      Garrison guard mount.
                                      CQ.

PERSONAL CARE (P)                     Breaks.
  Authorized activities only.         Taking showers.
                                      Changing clothes.

TABLE 2. DISTRIBUTION OF TOTAL TIME UNITS BY
MAJOR ACTIVITY AREAS.

(UNITS = number of time units; % = percent of that column's total)

                     TRAINING DAYS     NON-TNG DAYS      ALL DAYS
ACTIVITY AREAS       UNITS      %      UNITS      %      UNITS      %

UNIT TNG              2901    14%        850    04%       3751    09%
INDIV TNG             3078    15%        782    04%       3860    10%
INDIV TNG (PT)        1722    08%       1982    10%       3704    10%
SUPPORT/GARRISON      6297    31%      10621    56%      16918    43%
PERSONAL CARE         2832    14%       1971    10%       4803    12%
TEACHING ACTIV         139    01%        141    01%        280    01%
ABSENCES (TURB)       3657    18%       2778    15%       6435    16%

TOTALS              20,626            19,125            39,751

NOTES

1. Average number of men per squad: Training = 8.03
                                    Non-Training = 8.46
                                    All Days = 8.25

2. Time units are recorded from the official start of the duty day to the
official close. Lunch time is NOT included as a recorded time unit.

UNIT TRAINING                   1 hr 02 minutes
INDIVIDUAL TRAINING             1 hr 05 minutes
PHYSICAL TRAINING (PT)          37 minutes
TRAINING OTHERS                 03 minutes
PERSONAL CARE                   1 hr
SUPPORT/GARRISON ACTIVITIES     2 hrs 14 minutes
ABSENT (TURBULENCE)             1 hr 18 minutes
TOTAL DAY                       7 hrs 19 minutes

Figure 3. Distribution of Time Devoted to Major Activities
During an Average TRAINING Duty Day.

TABLE 3. BREAKDOWN OF ACTIVITIES ENGAGED IN WHILE
ABSENT FROM DUTY

                          % OF TOTAL TIME ABSENT
ACTIVITY               TNG DAY    NON-TNG DAY   COMBINED DAY

MEDICAL                  10%         03%           07%
PERSONAL                 04%         01%           03%
MILITARY EDUCATION       21%         28%           25%
PERSONAL EDUCATION       08%         04%           07%
DETAILS/CQ               27%         25%           26%
DISCIPLINARY              0          11%           06%
LEAVE                    08%         11%           10%
CLEARING                 10%         01%           06%
COMP TIME                07%         05%           07%
OTHER                    03%         11%           03%

TOTAL TIME ABSENT     1 hr 18 min  1 hr 02 min  1 hr 13 min

TABLE 4. ANALYSIS OF BEHAVIORAL MODE DURING UNIT & INDIVIDUAL
TRAINING ACTIVITIES FOR A TRAINING DUTY DAY

MODE                               TIME          % OF TOTAL TNG TIME

RECEIVING INSTRUCTION              25 min              20
PERFORMING TASKS                   57 min              45
WAITING TO RECEIVE INSTRUCTION     10 min              08
EN ROUTE                           23 min              18
OTHER                              11 min              09

TOTAL                              2 hrs 07 min        100%
SLIDE X

Of the 24% (32 men) who left the company:

9 - DISCHARGED
    2 ETS
    2 CHAPTER 13
    2 CHAPTER 15
    2 AWOL-DFR
    1 MEDICAL

23 - TO OTHER UNITS
    3 OVERSEAS
    2 OTHER DIVISION
    7 OTHER BATTALION
    7 OTHER COMPANY, SAME BATTALION
    1 SPECIAL SCHOOL
    3 RE-ASSIGNABLE OVERSTRENGTH


PROBLEMS IN MEASURING TEAM EFFECTIVENESS
Albert L. Kubala

Human Resources Research Organization,¹ Fort Hood, Texas 76544


Background 

Borrowing heavily on characteristics of teams described by Glaser,
Klaus, and Egerman,² as well as Hall and Rizzo,³ Wagner, Hibbits,
Rosenblatt, and Schulz⁴ define team training as:

The training of two or more individuals who
are associated together in work or activity.
The team is relatively rigid in structure and
communication pattern. It is goal- or mission-
oriented with the task of each team member well-
defined. The functioning of the team depends
upon the coordinated participation of all or
several individuals. The focus of team training
and feedback is on team skills (e.g.,
coordination), activities and products.

It can be seen from the implied definition of a team that a team could be
composed of anything from a two-man crew to a unit of almost any size.
However, most of the literature dealing with teams has considered
relatively small units such as those associated with one piece of equipment,
such as a tank or aircraft, or at most, a platoon with a single objective
or mission. Wagner, et al. further point out that, while the military


¹This work was performed under Contract DAHC19-75-C-0025 to the US
Army Research Institute for the Behavioral and Social Sciences (ARI).
Dr. Charles O. Nystrom was the Contract Monitor.

²R. Glaser, D. J. Klaus, and K. Egerman. Increasing team
proficiency through training: 2. The acquisition and extinction of a team
response, Technical Report AIR B64-5/62, American Institutes for Research,
May 1962.

³E. R. Hall and W. A. Rizzo. An assessment of US Navy tactical team
training: Focus on the trained man, TAEG Report No. 18, Training Analysis
and Evaluation Group, March 1975.

⁴H. Wagner, N. Hibbits, R. D. Rosenblatt, and R. Schulz. Team
training and evaluation strategies: State-of-the-art, Technical Report 77-1,
Human Resources Research Organization, Alexandria, Virginia, February
1977.

services conduct up to 90% of their training in the operational commands,
most training research has been focused on individual training in
institutional settings. For example, in FY 1974, the Army Research Institute
for the Behavioral and Social Sciences (ARI) initiated the largest program
of unit training and evaluation research in history. Yet, only 11% of the
human resources budget was spent in this area. Judging from the
literature, the resources devoted to this area by the other military
services have been roughly comparable. This lack of emphasis seems strange
in view of the fact that most fighting has been, and will continue to be,
done by teams. It now seems critical that we determine how well our teams
do function, for as MG Gorman has stated, we must:

...train the Army to win on the first battlefield
of the next war against an enemy that
outnumbers us, against an enemy whose weapons
will be as good as or nearly as good as those
we possess....⁵

In other words, we can ill afford any but the most effective fighting
teams. And, to ensure maximum effectiveness, Measures of Effectiveness
(MOE) must be derived so that commanders can evaluate their own teams,
discover deficiencies, and take corrective measures.

Our HumRRO contingent at Fort Hood became involved in this area when
we were asked to determine what set of MOE were currently being employed
to evaluate tank crews, and to determine what additional research was
needed to ensure a comprehensive evaluation capability. We soon found
that for all practical purposes, the only MOE in current use are scores
on Table VIII, otherwise known as the Tank Crew Qualification Course.⁶
For those of you unfamiliar with Table VIII, it should suffice for the
moment to know that it is a live-fire gunnery exercise, where crews are
scored on both hit accuracy and times to engage targets. Looking further
at this MOE, we were surprised to find that the reliability of Table VIII
scores has apparently never been determined, and that many question its
validity as a predictor of combat effectiveness. We wondered why no other
MOE were in use, and why one which was somewhat suspect was in general use.
We wondered what the problem(s) was (were). Therefore, we decided that the


⁵W. E. DePuy and P. F. Gorman. "TRADOC Mission and resources
briefing," transcript from TV tape, US Army Training and Doctrine Command,
Fort Monroe, Virginia.

⁶A. Larson, W. K. Earl, and V. A. Kensor. Assessment of US tank
crew training, TCATA Test Report No. FM 331, Final Report (23 March 75 -
13 March 76), HQ, TRADOC Combined Arms Test Activity, Fort Hood, Texas,
July 1976.

next step should be a study of the problems associated with the
development and use of team evaluations, which is the subject of this paper.

I doubt that anything I say will really be new to any of you. My
purpose in presenting this paper is simply to re-focus your collective
attention on these problems. I feel that the areas of team training and
evaluation, especially evaluation, have been much neglected. Hopefully,
this presentation will generate some interest in, and lead some of you
toward, solutions for some of the problems I will discuss. We have
painstakingly developed procedures for building training programs and
evaluating individuals. We have our interservice procedures for
instructional systems development,⁷ and are now, in the Army, developing
individual Skill Qualification Tests (SQTs). These tests will be
designed to test actual job performance as well as knowledge, and successful
performance will be a prerequisite for both retention and/or promotion.
However, we have no similar procedures for either curriculum development
or evaluation of teams, and they are sorely needed.


Problems

The particular problems which I have chosen for further elaboration
are shown in the next slide.

SLIDE 3

• Defining Effectiveness

• Defining Team Effectiveness

• Problems With Numbers

• Reliability

• Evaluation Strategies

• Resources

Defining effectiveness. Historically, MOE were derived to ensure the
quality of newly developed hardware. For one of our simplest weapons, the
rifle, accuracy was the original MOE. Somewhat later, rate of fire was
added as an MOE. Still later, it was realized that a highly accurate
rapid fire weapon was of little value unless it were completely functional.
Therefore, the concept of "availability" came into being as an MOE, and
was measured by such things as Mean Time Between Failure (MTBF) and Mean
Time to Repair (MTTR). However, the primary reason for the proliferation
of MOE was the recognition that effectiveness was mission-dependent. For
example, the weapon characteristics desirable for a sniper rifle are quite
different from those required for a weapon designed primarily for
suppression. In selecting a rifle, a sniper would be primarily interested in

⁷TRADOC PAM 350-30. Interservice procedures for instructional systems
development, US Army Training and Doctrine Command, Fort Monroe, Virginia,
1 August 1975.

accuracy and range, but would not be too concerned about rate of fire.
On the other hand, the soldier with the suppression mission would be very
concerned with rate of fire, but not too concerned with accuracy.
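The paper says only that availability was measured "by such things as"
MTBF and MTTR; it does not give a formula. One conventional way to
combine the two, offered here purely as an assumption and not as the
author's method, is the steady-state ratio MTBF / (MTBF + MTTR). The
function name and the sample figures below are hypothetical.

```python
def inherent_availability(mtbf_hours, mttr_hours):
    """Fraction of time an item is functional: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if mttr_hours < 0:
        raise ValueError("MTTR cannot be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical weapon: fails every 95 hours on average, 5 hours to repair.
example_availability = inherent_availability(95.0, 5.0)
```

Under this ratio a weapon that rarely fails but takes long to repair
can score the same as one that fails often but is fixed quickly, which
is one reason a single MOE seldom tells the whole story.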


An actual example from history serves to further illustrate the
problems in defining effectiveness and the necessity to consider the
mission in selecting MOE. In the early phases of WWII, a great many British
merchant vessels were damaged or even destroyed by aircraft attacks. As
a consequence, merchant vessels were equipped with antiaircraft guns and
crews. After a period of time it was discovered that only 4% of the
attacking enemy aircraft were actually shot down. This led some to
conclude that the systems were ineffective on ships and could be better
employed elsewhere, where kill rates were higher. Employing this MOE, the
decision seemed inevitable. However, further examination of the data
revealed that the antiaircraft fire greatly reduced the lethality of the
enemy attack. In fact, the inclusion of antiaircraft weapons virtually
halved the probability that a ship would be sunk. Viewed in this light,
the systems were considered highly effective. In other words, the
selection of the wrong MOE, or the exclusion of critical MOE, can lead to
the wrong decision about effectiveness.
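The contrast between the two MOE in this example can be made concrete
with a small sketch. The counts below are HYPOTHETICAL, chosen only to
reproduce the pattern described (a 4% kill rate alongside a halved
probability of sinking); they are not historical data, and the class
and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class AttackRecord:
    attacks: int          # enemy air attacks on ships in the group
    aircraft_downed: int  # attacking aircraft shot down
    ships_sunk: int       # ships lost to those attacks

    def kill_rate(self):
        """MOE 1: share of attacking aircraft shot down."""
        return self.aircraft_downed / self.attacks

    def sink_probability(self):
        """MOE 2: chance that an attack sinks the ship."""
        return self.ships_sunk / self.attacks


# HYPOTHETICAL counts, for illustration only.
armed = AttackRecord(attacks=100, aircraft_downed=4, ships_sunk=10)
unarmed = AttackRecord(attacks=100, aircraft_downed=0, ships_sunk=20)

relative_risk = armed.sink_probability() / unarmed.sink_probability()
```

Judged by kill rate alone the guns look nearly useless; judged by the
probability of losing the ship, they cut the risk in half.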


One further point needs to be emphasized. Training authorities and
evaluators are not generally interested in the same kinds of MOE as
hardware developers. The hardware is developed and fielded long before they
get into the act. They must train personnel to use the equipment as it
is, and must evaluate the effectiveness of the combination of the man and
machine system. It matters little if a bench-fired weapon places 100
consecutive rounds within a 6-inch circle at 1000 meters, if a typical user
cannot hit a stationary enemy at 50 meters when employing the weapon.
When evaluating training or unit readiness, the mission to be accomplished
must be considered and the criteria of success must be set realistically
in terms of the potential for man/machine effectiveness. Unfortunately,
written guidance for the evaluator to aid him in selecting or developing
MOE is nil.


Defining team effectiveness. One of the major problems associated
with the evaluation of team effectiveness has been the inability of
investigators to agree on what differentiates team and individual tasks.
Most investigators agree that it is wasteful of effort to measure
performance in a team context when the performance is actually nothing more
than an aggregate of individual performances. Individual job skills can
almost always be measured more easily, completely and cost effectively
through individual job performance tests. It is felt that measurement of
performance in a team context should be reserved for only those tasks which
are truly team tasks; that is, tasks which require cooperation or
coordination to the extent that skills must be practiced in a team
situation in order to be optimized.

Hall and Rizzo characterized tasks performed by teams as being in
either "established" or "emergent" situations. In established task
situations, the sequence of task performance and the activities involved can
be almost completely specified. Also, the assignment of task functions
among team members and the equipment they operate are virtually fixed.
In emergent situations, decision-making, problem-solving and altering come
to the forefront. The sequence of operations is not fixed, and the
allocation of functions is variable. Hall and Rizzo essentially conclude that
tasks performed in established situations are not really team tasks.
Rather, overall task performance is simply the sum of the performances of
the individual team members. Therefore, tasks performed in established
situations should not be evaluated in a team context.

Unfortunately, in discussing various tasks with knowledgeable people
in the armor community, I have found little agreement as to which tasks
are established and which are emergent. For example, some have told me
that firing on the move is definitely a team task. The advocates of this
position point to the need for precise timing between the driver, who must
find a level spot at exactly the right moment and maintain his direction,
and the rest of the crew. Others feel that any accomplished driver does
this habitually, and that so long as all crew members are individually
competent, the procedures employed ensure the proper conduct of the
engagement. I will not attempt to defend either of these positions; I
mentioned this example only to illustrate the differences of opinion I have
encountered in trying to differentiate team performances from performances
which are merely an aggregate of individual performances.

Problem with numbers. In attempting to fully describe the job
situations of a tank crew in gunnery, Kraemer, Boldovici, and Boycan^8
derived a set of 11 classes of conditions or variables that could affect a
crew's capability to successfully engage targets. Some examples of these
classes and the number of levels identified for each class are shown in the
following slide. The term "levels" refers to subclasses within a main
class. If a tank gunnery objective were written for all possible
combinations of levels, a total of 1,679,616 objectives would result.
However, a large number of combinations are unrealistic (e.g., a moving
bunker) and were discarded. Judicious combination of other levels reduced
the total number of realistic combinations to the current number of 266.
To test a crew's ability to perform all of these job objectives would be
time-consuming, to say the least, and it must be remembered that these
objectives cover only tank gunnery. Obviously, it is not feasible to
measure job proficiency on all possible job objectives. Tests designed to
measure effectiveness will be able to address only a limited number of the
objectives. However, the need to select a limited subset of job objectives
for testing is likely to produce unfortunate results. Training is almost
certain to be concentrated on those areas which will be tested, to the
detriment of other aspects of the job. This might be avoided by testing
each crew on only a small sample of jobs from the total job realm. If no
crew knew exactly which set of items they would receive, they could not
slant their training to the tests. However, the development of test items
for every aspect of the job would be expensive. Also, the resources
necessary for testing all aspects of the job would be extensive. In short,
it appears that we have too many tasks and too few resources.


^8 R. E. Kraemer, J. A. Boldovici, and G. G. Boycan. Job objectives
for M60A1AOS tank gunnery, ARI Research Memorandum 76-9, Human Resources
Research Organization, April 1976.


Conditions and Levels Within Conditions*

Conditions                Levels Within Conditions

Weapon                    Main Gun
                          Coaxial Machinegun
                          Caliber .50 Machinegun

Fire Delivery Method      Battlesight (non-precision for machineguns)
                          Precision
                          Range Card
                          Range Card Lay to Direct Fire

Firing Vehicle Motion     Stationary
                          Moving

Target Visibility         Visible Without Artificial Light
                          Visible With Artificial Light
                          Not Visible

Target Range              <500 meters
                          500-900 meters
                          <900 meters
                          <1100 meters
                          1100-1600 meters
                          500-3200 meters
                          1100-2300 meters
                          1100-3200 meters
                          ALL

*Condensed from Fig. 2, page 2, R. E. Kraemer, J. A. Boldovici, and
G. G. Boycan, Job objectives for M60A1AOS tank gunnery, ARI Research
Memorandum 76-9, Human Resources Research Organization, April 1976.
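
The counting argument in the "Problem with numbers" discussion can be sketched in a few lines of Python. The sketch below uses only the five condition classes listed in the slide (the full analysis used 11 classes, which are not all given in this paper), so it yields 648 combinations rather than 1,679,616. The screening rule in `is_realistic` and the test length of 10 items are hypothetical, chosen only to illustrate discarding unrealistic combinations and then drawing an unannounced random sample of objectives for each crew.

```python
import itertools
import math
import random

# Levels for the five condition classes shown in the slide. The full
# analysis used 11 classes; the other six are not listed in this paper.
LEVELS = {
    "weapon": ["main gun", "coaxial machinegun", "caliber .50 machinegun"],
    "fire_delivery": ["battlesight", "precision", "range card",
                      "range card lay to direct fire"],
    "vehicle_motion": ["stationary", "moving"],
    "target_visibility": ["visible without artificial light",
                          "visible with artificial light", "not visible"],
    "target_range": ["<500", "500-900", "<900", "<1100", "1100-1600",
                     "500-3200", "1100-2300", "1100-3200", "all"],
}


def count_combinations(levels):
    """Number of objectives if every combination of levels were written."""
    return math.prod(len(options) for options in levels.values())


def enumerate_objectives(levels):
    """List every combination of levels as a dict keyed by condition class."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in itertools.product(*levels.values())]


def is_realistic(objective):
    """HYPOTHETICAL screening rule, for illustration only: assume that
    range-card fire is never delivered from a moving vehicle."""
    uses_range_card = objective["fire_delivery"].startswith("range card")
    return not (uses_range_card and objective["vehicle_motion"] == "moving")


def draw_test(objectives, n_items, seed=None):
    """Draw a random sample of objectives, without replacement, for one crew."""
    rng = random.Random(seed)
    return rng.sample(objectives, n_items)


all_objectives = enumerate_objectives(LEVELS)
realistic = [obj for obj in all_objectives if is_realistic(obj)]
crew_test = draw_test(realistic, n_items=10, seed=1)
```

Sampling in this way addresses only the problem of crews slanting their training toward known test content. As noted above, items would still have to be developed for every objective in the pool.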

Reliability. We can only hope that our MOE are valid; that is, that
they are indicative of how our teams would perform in combat. However, we
usually can estimate their reliability. We were surprised, therefore, to
find that the reliability of Table VIII scores has apparently never been
determined. The only data located which even bear on the subject are those
reported by Baerman and Eaton^9. They found a correlation of r = .68
between ratings of tank commander motivation and Table VIII scores. This
would indicate that the reliability of the Table VIII scores was at least
0.68. However, there were several differences between both the conduct and
the scoring procedures employed by Baerman and Eaton and those typically
employed. A major difference was that scoring of hits was based on a
close-in, after-the-fact examination of the targets rather than by an
observer riding the tank. These investigators found early in their
research that the observer determinations of hits were subject to
considerable error. Therefore, had the Table VIII scores been obtained in
the usual manner, quite different results might have been obtained. My
personal feeling is that the test/retest reliability of Table VIII scores
derived as recommended in FM 17-12^10 would be unacceptably low.

Steinheiser and Snyder^11 pointed to another reliability-related
problem with Table VIII. For example, assume that 70% is a passing score.
Further assume that we test 100 crews whose "true" level of functioning
is exactly 70%. By chance, 47 of these crews would score less than 70%,
and therefore be misclassified as nonproficient. Similarly, 21% of the
crews whose true level of functioning was only 60% would, by chance, be
misclassified as proficient. Errors of misclassification could be reduced
by increasing the length of the test to improve its reliability. However,
increasing the length would also increase the resource requirements, and
resources are extremely scarce at this point in our history.
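
The kind of misclassification described above can be illustrated with a simple binomial model. The Python sketch below assumes that a test consists of independent, equally weighted pass/fail items and that a crew's "true" level is its probability of success on each item. This model, the function names, and the test lengths of 10 and 100 items are assumptions made for illustration; the model and test length actually used by Steinheiser and Snyder are not given here, so the sketch is not expected to reproduce their figures of 47 and 21%.

```python
from math import comb


def prob_pass(n_items, true_p, cutoff_percent=70):
    """Probability that a crew with per-item success probability true_p
    scores at or above the cutoff on a test of n_items independent items."""
    # Smallest number of successes that meets the cutoff (integer ceiling).
    needed = -(-cutoff_percent * n_items // 100)
    return sum(comb(n_items, k) * true_p**k * (1 - true_p)**(n_items - k)
               for k in range(needed, n_items + 1))


def false_negative_rate(n_items, true_p, cutoff_percent=70):
    """Chance that a crew at or above the standard is scored nonproficient."""
    return 1.0 - prob_pass(n_items, true_p, cutoff_percent)


def false_positive_rate(n_items, true_p, cutoff_percent=70):
    """Chance that a crew below the standard is scored proficient."""
    return prob_pass(n_items, true_p, cutoff_percent)


# Crews exactly at the 70% standard, on a 10-item test.
fn_short = false_negative_rate(10, 0.70)
# Crews at a true level of 60%, on 10-item and 100-item tests.
fp_short = false_positive_rate(10, 0.60)
fp_long = false_positive_rate(100, 0.60)
```

Under these assumptions, lengthening the test from 10 to 100 items sharply reduces the chance that a 60% crew is scored as proficient, which is the trade-off between test length and resources noted above.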

^9 V. P. Baerman and N. K. Eaton. "Crew assignment and training," Armor,
January-February 1977, 50-53.

^10 FM 17-12. Tank gunnery, HQ, Department of the Army, Washington,
D.C., March 197 '.

^11 F. Steinheiser, Jr., and C. W. Snyder, Jr. "Score quality issues
related to individual and weapon crew criterion-referenced performance
tests," presented at the Military Testing Association Conference, October
1976.

To recapitulate, our evaluations of tank crews are currently based
almost entirely on performance in Table VIII. Yet, Table VIII scores
are of unknown but questionable reliability. Because of this nearly total
reliance on Table VIII, it is imperative that its reliability be
determined, and that every attempt be made to improve its reliability,
either by changes in scoring procedures or modifications to the conduct of
the test. However, to date, I have been unable to obtain the necessary
support to conduct a reliability study.

I  have  not  closely  examined  specific  team  evaluation  procedures  in 
any  other  context.  Therefore,  I  have  no  idea  whether  other  branches  in 
the  Army  or  other  military  services  face  similar  problems,  but  I  strongly 
suspect  that  they  do. 

Testing strategies. Two principal issues divide evaluators in their
approaches to testing. These are the employment of (a) one- vs. two-sided
test situations, and (b) process vs. outcome measurements.

One-sided vs. two-sided tests. In a one-sided test, such as
Table VIII, the examinees face a relatively structured situation in which
the sequence of events is relatively fixed. "Aggressor" forces, if
present at all, are restricted to specific preplanned activities. In a
two-sided test, aggressor forces must be present and typically have few
limitations placed on their activities. The advocates of two-sided
exercises stress the importance of realism, the opportunities for real-time
decision-making, and the morale-boosting aspects of competition. They
also point out that the inflexibility of one-sided tests makes them easy
to train and practice for. Therefore, they feel such tests provide only
poor indications of how the participants would actually perform in combat.

Those favoring the one-sided approach to evaluation point to the
fact that repetition of the identical circumstances is virtually
impossible in a two-sided test. Therefore, no two individuals or teams
receive exactly the same test, making it impossible to set exact
performance standards or to compare the performance of any two teams. I
should point out that choosing the type of test is not always a problem,
for the type of data required frequently determines the most suitable
type. For example, if exact times are needed, such as the time to fire
after line-of-sight to a target is achieved, a one-sided test should be
employed. Knowledge of the exact moment the target appeared would be
virtually impossible in a two-sided test. One-sided tests are also
generally necessary if live-fire is required.

Two-sided exercises are considered essential when targets must be
generated. For example, a two-sided exercise would be necessary if the
MOE were to be the ratio of friendly to threat casualties.

Process vs. outcome measurements. Stated very simplistically,
"process" measurements are concerned with an evaluation of all of the
actions taken during an engagement, but are not particularly concerned with
the final outcome. "Outcome" measurements are not concerned with the
procedures involved or the progress of the engagement, but only in who wins
and who loses.

Osborn^12 is an advocate of process measurement. He feels that to be
useful, a test must be diagnostic. That is, it must provide information
on exactly why a particular aspect of performance was successful or
unsuccessful. Hammell, Casteyer, and Pesch^13 state the case for process
evaluations in discussing Advanced Officer (AO) tactics training as shown
in the next slide. In other words, Hammell, et al. feel that process is
the only important aspect of performance in training evaluations. A good
decision or action may lead to a poor outcome, but the decision or action
should be evaluated on its own merits, and not on the vagaries of future
actions by an unpredictable enemy.

SLIDE  5 

...numerous alternative sequences of actions may
exist, many of which may be equally plausible for
attaining a specific objective. The sequence of
actions employed by the AO contains a complex
series of evaluations and action selections which
are situation dependent. The attainment of the
ultimate objective may often be irrelevant to the
evaluation of the AO's performance. This hit or
miss philosophy, although distinctly meaningful
in the operational environment, is inadequate in
the training situation.

The case for outcome measurements can be stated rather simply. In
an operational environment, commanders are more interested in
friendly/enemy loss ratios, resources expended, and territory won or lost.
The attainment of some set of predetermined mission-oriented goals along
these dimensions is a much more meaningful measure of effectiveness to the
field commander.


^12 K. C. Osborn. Process versus product measures in performance
testing, Professional Paper 16-74, Human Resources Research Organization,
Alexandria, Virginia, October 1974. (Based on paper for Military Testing
Association Meeting, San Antonio, Texas, October 1973.)

^13 T. J. Hammell, C. E. Casteyer, and A. J. Pesch. Advanced Officer
tactics training device needs and performance measurement technique,
Volume I, TR: NAVTRAEQUIPCEN 72-C-0033-1, General Dynamics Corporation,
Electric Boat Division, Groton, Connecticut, November 1973.

^14 Ibid.

^15 Italics added by author.


Perhaps you are wondering why I bring up these strategies in a paper
dealing with problems. The situation as I see it is this: We need
process evaluations for feedback to training managers, and we need outcome
evaluations to meet the needs of field commanders. Yet, it is difficult
to obtain process information from a two-sided test and even more difficult
to obtain outcome information of the kind desired by commanders from a
one-sided test. It is difficult enough to obtain resources for even one
type of test, much less two. The problem is in finding a way to combine
the best features of both types of tests without undue expenditure of
scarce resources.


Resources. I have already mentioned the resource problem in passing
several times. The military services are experiencing one of the longest
and most severe periods of austerity in their recent history. Yet, as has
been pointed out, adequate evaluations are quite demanding of resources.
In less austere times, Baker and Cook^16 painstakingly constructed a "Tank
Platoon Combat Readiness Check." The final checklist, including
instructions to the examiner, was approximately 90 typewritten pages in
length.

The  authors  also  pointed  out  that  the  entire  evaluation  took  approximately 
3G  hours  to  administer  and  required  the  use  of  "aggressor"  forces.  At 
the  present  time,  most  commanders  would  consider  the  resources  required 
for routine conduct of such an evaluation to be out of the question.


It seems obvious that we cannot develop adequate evaluation
techniques for team performance unless additional resources can be found.
While such is unlikely in an absolute sense, the possibility of
conserving resources for evaluations offers some hope. Simulation
techniques, for example, are being employed for training with increasing
frequency and with little apparent loss in training effectiveness. For
example, Powers, McCluskey, and Haggard^17 trained four groups of tank
gunners employing 100%, 66%, 33%, and 0% live-fire. There were no
differences between the hit percentages of the four groups in a live-fire
posttraining test. Therefore, it appears that considerable ammunition
could have been saved with no loss in training effectiveness.


Whether through the use of simulation or by other means, it is our
opinion that the problem is not whether we expend the resources, but
rather, how we obtain the necessary resources. As MG Gorman has stated,
we must be prepared to fight outnumbered against an enemy whose weaponry
will be virtually equal to ours. To do so, we must be able to accurately
evaluate our fighting teams, and take corrective actions to eliminate any
deficiencies.


^16 R. A. Baker and J. C. Cook. The development and evaluation of the
tank platoon combat readiness check, Research Memorandum, Human Resources
Research Organization, Alexandria, Virginia, April 1963.

^17 T. R. Powers, M. R. McCluskey, and D. P. Haggard. Determination of
the contribution of live firing to weapons proficiency, Final Report
FR-CD(C)-75-1, Human Resources Research Organization, Alexandria,
Virginia, March 1975.


TESTING FOR COORDINATION IN SMALL UNITS

Clay  E.  George 
Texas  Tech  University 


One purpose of this paper is to demonstrate the need for formal
testing procedures to determine unit member response coordination in
small military units. I will attempt to show that "response
coordination" is a major determinant of small unit efficiency and that it
is neither a "state" nor a "trait" variable. It appears to be a
characteristic which an individual exhibits consistently within a given
social-behavioral task setting but which may vary greatly across such
settings.

A second purpose is to suggest approaches to the measurement of
coordination in various types of military units. Research efforts over
the past 20 years have made it possible to specify precise measurement
operations for the rifle squad. They have also produced guidelines for
establishing such operations relative to other types of units.

A  model  of  unit  performance  (Figure  1) 


My analysis of the determinants of performance in mission oriented
groups is probably very much like anyone else's. A group cannot
perform well if its members are not competent. The contributions of
psychologists and others have been very great in establishing effective
procedures for individual selection and training in military
environments. In effect, we have some rather impressive technologies for
ensuring that our units are composed of proficient individuals, within
the limits of available human resources.

Higher level leadership and "organizational climate" also affect
unit performance. I think that the ideas presented here are also
relevant to this topic (Klein, Gibbs, George, Pruitt & Patrizi, 1977) but
my central focus is on the lower level unit.

A number of "morale" variables have been shown to correlate with
unit performance, especially in stressful circumstances (e.g. Dudek,
George & Ayoub, 1969; George, 1965; George, 1970 and Hodge, 1972). The
weight of the evidence, however, seems to indicate that these morale
variables are often symptoms or effects rather than "causes" of
performance.

The primary concern here is with delineating, observing and
counting specific coordinative responses available to unit members. To
further this objective, type of unit is roughly categorized in a second
model.


Model_ 

It has been necessary to take into account both the degree of group
structure and the flexibility of that structure to explain the findings
from my research program. Degree of structure is measurable in several
ways (George, 1962). Perhaps the simplest way, for present purposes, is
to use the ratio between the number of role specialties and the number
of group members. If each unit member has a unique specialty within the
group, that group is completely (100%) structured. If every unit member
could have exactly the same role specialty (no leadership), the group
would be completely unstructured. Military units tend to be relatively
highly structured. We are, therefore, primarily concerned with the
higher end of this dimension.

Flexibility of structure is measured (or estimated) by the
probability of role interchange in the operational (not necessarily
training) environment. The rifleman in an infantry squad has a very high
probability of being required to take over some (or all) of the roles of a
grenadier or team leader during operations. A steward on a MATS flight,
on the other hand, has a low probability of taking over the pilot's role
successfully. Highly but flexibly structured small units will be called
teams. These include Army and Marine Corps infantry squads and Navy and
Air Force advanced base security units, among others. Crews are highly,
but less flexibly structured units such as aircraft and tank crews. The
team-crew distinction is a useful aid to the recognition and measurement
of intraunit response coordination.
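
The two dimensions just described can be expressed as a short Python sketch. Degree of structure is computed as the ratio of role specialties to unit members, as suggested above, and flexibility is taken as an estimated probability of role interchange. The function names and the 0.5 cut points used to separate "high" from "low" are hypothetical; the paper gives no numerical thresholds.

```python
def structure_ratio(n_role_specialties, n_members):
    """Degree of structure: role specialties per unit member.

    Equals 1.0 (100%) when every member has a unique specialty and
    1/n when all n members share a single role.
    """
    if n_members <= 0 or not 1 <= n_role_specialties <= n_members:
        raise ValueError("need 1 <= role specialties <= members")
    return n_role_specialties / n_members


def classify_unit(structure, flexibility,
                  structure_cut=0.5, flexibility_cut=0.5):
    """Label a unit as a team or a crew.

    flexibility is the estimated probability of role interchange in the
    operational environment. Both cut points are HYPOTHETICAL values
    chosen for illustration only.
    """
    if structure < structure_cut:
        return "low-structure group"
    return "team" if flexibility >= flexibility_cut else "crew"


# Illustrative (invented) values: a four-man crew with four distinct roles
# and a low chance of role interchange.
example_structure = structure_ratio(4, 4)
example_label = classify_unit(example_structure, flexibility=0.1)
```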


Measurement Approaches


Symptomatic variables. Cohesion/status is perhaps the most
thoroughly studied of this class of variables. A common measurement
process involves rankings or ratings by each group member of the respect
or affection they hold for each other member. This measurable aspect of
units does tend to correlate with performance and it probably helps to
protect performance levels from the deleterious effects of stress (George,
1962, 1965, 1967, 1970). The validity exhibited by cohesion as measured in
research settings, however, tends to wane if used administratively in
ways that might produce contingencies for the group or for any member(s)
thereof. The use of peer ratings or rankings in officer candidate
programs produced fairly good prediction of combat performance in WW II, a
lower level of prediction during the Korean conflict (cf. Jennings, Rose
& Kreug, 1974) and a negative relationship with leadership knowledge and
skills during the Vietnam era. The latter finding is based on unpublished
research undertaken by myself and others as a part of technical advisory
services during the late 1960s. Any administrative use of this sort of
measurement process should be undertaken with great caution.

A second set of symptomatic variables results from Carter's (1955)
derivation of three factors describing the behaviors of persons in small
groups. Inventory measures have been reported by Bass (1962) and by
George (1967). These scales indicate the person's tendency in groups to
maximize: (1) personal achievement (intragroup competition), (2)
socializing and/or (3) unit efficiency (coordination). Unit member scores
on the motive to coordinate do predict unit performance on tasks requiring
coordination, but those who start out low on this characteristic show
significant increases when their unit is reinforced for improved
performance (George, Hoak & Boutwell, 1963; George, 1967).

Each  of  the  symptomatic  variables  is  useful  as  a  research  tool  when 
looking  for  behavioral  measures  of  unit  functioning.  Using  these  vari¬ 
ables  as  primary  predictors  or  indicants  of  unit  characteristics  is  a 
complex and uncertain procedure, however. One is well advised to
proceed with caution.

Behavioral  response  coordination 

At the most general level, coordination is composed of recognizing
and acting upon a unit's need without specific direction or instruction
to do so (Figure 4). This level may be illustrated by a study of 51
Army ROTC sophomore cadets working in 8 "leaderless" groups with 4-9
members each (George, Simms & Lumpkin, 1969; George, Simms, Deardorff &
Hafer, 1969; George & Dudek, 1974). The groups were given the task of
assembling, from a stack of parts, one fewer rifles than the group had
members. Those who quickly assembled a rifle, or who had no parts to work
with, could choose to help others or to stand idle during the work period.
One observer per cadet tallied each unrequested, spontaneous action
(suggestion, direct aid, etc.). After the task was completed, each cadet
wrote a critical incident report on at least one other cadet whose
behavior had affected the group's performance. Spontaneous coordination
(initiative) correlated .47 with number of positive critical incidents
credited by peers, -.37 with negative critical incidents and .45 with
global peer leadership evaluations taken approximately 6 weeks later. Each
of these correlations was significant at the .05 level. Grade point
average had an insignificant correlation of .04 with coordination. The
general pattern of results indicates that, even in low structure groups,
one can measure coordination with some degree of consensual validity.

Experiments with intact fire teams and rifle squads (highly
structured units with missions requiring flexibility of structure) also
show that observers can count coordinative responses (George, 1967, 1970).
Coordination responses in fire teams acting as base of fire elements
included increasing sector of fire when another could not cover his sector,
redistributing ammunition as required, taking the position of leader or
automatic rifleman when casualties occurred, etc. This class of responses
increased over training trials (each trial facing the team with different
emergencies requiring different specific responses from the various
people involved) from 21% of the needed responses on trial 1 to 65% on
trial 4 (p < .05). Over the entire problem, coordination scores correlated
.72 with a criterion fire distribution score and .56 with a criterion
fire volume score. Both correlations were significant at the .05 level.
In team situations, response coordination can be measured, it can be
increased by appropriate training and it can be shown to improve
performance on criteria of military importance.
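
The scoring used in these team studies, in which coordinative responses are counted against the number the situation required and the resulting scores are then correlated with a performance criterion, can be sketched in Python as follows. The function names and the sample tallies are invented for illustration; they are not data from the studies cited.

```python
from math import sqrt


def coordination_score(responses_made, responses_needed):
    """Proportion of the needed coordinative responses actually observed."""
    if responses_needed <= 0:
        raise ValueError("responses_needed must be positive")
    return responses_made / responses_needed


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    return cross / sqrt(sum(dx * dx for dx in dev_x) *
                        sum(dy * dy for dy in dev_y))


# HYPOTHETICAL observer tallies for five teams: (responses made, needed).
tallies = [(3, 12), (5, 12), (8, 12), (9, 12), (11, 12)]
scores = [coordination_score(made, needed) for made, needed in tallies]

# HYPOTHETICAL criterion scores (e.g., fire distribution) for the same teams.
criterion = [40, 48, 61, 59, 72]
r_with_criterion = pearson_r(scores, criterion)
```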

The specification and measurement of spontaneous intracrew
coordination is, unfortunately, less well established. It is true, of
course, that crew coordination can more often be achieved by leader
direction and control or by machine provided cues than is the case of teams
(George, 1970; Miller, 1971). Still, crews with military missions will
face situations requiring spontaneous coordination and outcomes may be
catastrophic in its absence. Task relevant communications are most often
suggested as appropriate measures for crew coordination (Brown, 1977;
McRae, 1966). A major problem is that efficient crews operate with an
absolute minimum of communication of any kind. There are indications in
the literature that number of task relevant communications may correlate
positively with performance given tasks of sufficient difficulty (Figure
5).


McRae (1966) studied 12 four-man crews of soldiers solving problems
of increasing difficulty over three trials. After partialling out
solution time, significant correlations of .68, .76 and .88 were reported
between task relevant communications and response accuracy. The more
difficult the problem, the more communication served a coordinative
function to enhance performance. Although this explanation is not a
compelling one from McRae's data there are additional experiments to
support it.
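
"Partialling out" a third variable, as was done with solution time in the study above, is conventionally done with the first-order partial correlation formula. The Python sketch below applies that standard formula to three zero-order correlations; the variable names and the illustrative values are assumptions, not figures from McRae's data.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy, r_xz and r_yz are the zero-order Pearson correlations among,
    for example, x = task relevant communications, y = response accuracy
    and z = solution time.
    """
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# HYPOTHETICAL zero-order correlations, for illustration only.
example = partial_r(r_xy=0.60, r_xz=0.50, r_yz=0.50)
```

If the third variable is uncorrelated with both of the others, the partial correlation equals the zero-order correlation, so partialling changes the result only to the extent that solution time is related to communication or accuracy.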

Brown (1977) studied 12 four-woman teams working problems under each
of four conditions of difficulty level. Time was held constant.
Correlations between task relevant communication (requesting information)
and number of correct solutions were -.89 for crews working on the easiest
problems and -.31, -.16 and .19 for those working on progressively more
difficult problems. Although only the largest of these correlations is
significant, it does appear that sufficiently difficult tasks may force
crew members to communicate when, and only when, task requirements demand
such coordinating responses.

George, Keating, Lumpkin and Miller (1971) reported that five-man
crews could perform well with very limited communication allowed to them
provided that they had been given an earlier training trial with
unrestricted communication opportunities. Conversely, teams trained under
severely limited communication conditions failed to perform well on
even the second of two transfer trials conducted with unrestricted
opportunity to communicate. Perhaps we will have to learn the specifics of
coordination by studying crews under very difficult task conditions.

Summary 


Learning to measure coordination in the rifle squad has provided a
general approach (Figure 3) which should be applicable to military teams
in general. Indications are that a research effort similar to that used
with teams will produce equally good measures of crew performance; that
is, study crew performance under very difficult circumstances (Figure 5).
Small unit evaluation is believed to be an application of behavioral
science which could be of major value to the military services. Such
measurement should greatly affect unit training programs and readiness
evaluation.

Figure 1. A model of small unit functioning.


DEGREE OF STRUCTURE (Roles:Persons ratio): High (1:1) to Low (1:n)
FLEXIBILITY OF STRUCTURE*: 0 to 100

CREWS
. marked common fate
. spatial closeness of members
. cues from co-workers and machines via good communications net
. command and control relatively easy

TEAMS
. cross training essential
. operational communications typically poor
. spatial distance between members
. cues from external environment, co-workers, machines
. command and control extremely difficult

LABOR GANGS

SEMINARS

*Probability of role interchange forced by uncontrollable events.

Figure 2. Model of small unit structural characteristics.


I.  Symptomatic  variables  (individual  and  group  characteristics  within 
the  unit-task-setting  environment) 

A.  Sociometric  (questionable  administrative  utility) 

1.  affection  (stress  resistance) 

2.  respect  (mutual  confidence) 

B.  Unit member motivation to maximize:

1.  personal  achievement  (intragroup  competitive) 

2.  socializing  (emotional  support) 

3.  unit  efficiency  (coordination) 

II.  Behavioral  coordination  of  response 

A.  Shared  attention  among: 

1.  one's primary job

2.  status of co-workers

3.  machine(s) in the unit system

4.  extra-unit task environment

B.  Recognition  of  initiative  taking  requirement 

C.  Respond  to  requirement 

1.  individual,  immediate  action 

2.  communicate  status  to  other(s) 


Figure  3.  Small  unit  level  correlates  of  performance. 

form others | Poor fire volume and distribution

Same as above | Loss of contact | Regain by movement, voice, visual search | Decreased effectiveness

Fear, terrain, poor visibility | Inadequate distribution | Moving, getting others to move, encouraging | Unnecessary exposure and loss

Heavy or surprise fires | Going to ground | Initiate fire and movement even without orders | Destruction of unit

Personnel turnover | Receipt of replacement | Accepting, supporting emotionally, training | Skill dilution, lost cohesion

Combat stresses | Indecisive behaviors | Suggesting, encouraging, correcting | Loss of unit initiative, drive

Figure 4. Operational conditions leading to coordination requirements by
team members.

Source: Small unit combat after action reports in Infantry
School library, Fort Benning, Ga.

(From McRae, 1966)

Independent groups, solving problems of increasing difficulty

Replications (trials) of increasing difficulty

Figure 5. Task-oriented communications in crews as correlates of
performance mediated by task difficulty.


A  CULTURE-FREE  PERFORMANCE  TEST  OF  LEARNING  APTITUDE1 

James  K.  Arima 
Naval  Postgraduate  School 

From World War I to the late 1950s, standardized mental tests with
nationally based norms became widely used for selection, placement, and
classification decisions. Their great acceptance was due, in large part,
to their role in furthering the American concept of an egalitarian society
(Holtzman, 1971). That is, decisions of considerable importance to
individuals could be made on the basis of merit, given a person's score on
an objective test of ability with the requisite reliability and validity.

The Armed Services were leaders in the testing movement, and the
use of the Army Alpha and Beta tests in World War I has been identified
with the beginning of the testing movement in which large numbers of
persons are routinely tested for selection and placement. Nearly two
million people were given the tests during the course of the war, and
the results provided much of the information for later studies of
demographic, socioeconomic, and cultural differences in intelligence and
ability (Matarazzo, 1972). World War II saw a similar emphasis on mass
testing and the development of the Army General Classification Test
(Melton, 1957). Again, the results of the testing program provided large
amounts of valuable information for scientific study that went far beyond
the limited purposes for which tests were originally administered.
Eventually, the AGCT was made available in commercial form for sale to
qualified users in the general public.

In the post-World War II years, the Armed Forces Qualification
Test (AFQT), scored in readily understandable percentiles, became
the standard, general test of mental ability for the services. The AFQT
designation of mental categories is still in use today. Throughout
these developments, special-purpose tests were also being created by the
individual services until a common entrance test was no longer the rule
with the advent of the All Volunteer Force (Melton, 1957; Windle and
Vallance, 1964). More recently, however, an emphasis on efficiency in
the testing program on the part of Congress and the Defense Secretariat
has seen the emergence of the Armed Services Vocational Aptitude Battery
(ASVAB) as a common test of general aptitude for military service. A
form of the ASVAB is also used in civilian secondary schools in the
High School Testing Program managed by the Armed Forces Vocational
Testing Group (AFVTG).


I am indebted to Peter A. Young for running the subjects and
collecting and analyzing the data as a part of his master's thesis (Young,
1975). Paul Sparks created the instrumentation for the experimental
administration of the test. The terms culture-free and culture-fair
will be used to mean the same thing.


The  growth  and  apparent  success  of  the  testing  movement  has  not 
been  without  its  critics  and  detractors.  The  criticism  did  not  reach 
social  significance  until  the  middle  and  late  sixties  when  many  of  our 
institutions  were  put  to  severe  test  with  a  reexamination  of  our  value 
systems  and  the  emergence  of  new  concepts  for  improving  the  quality  of 
life  in  America.  The  routine  testing  of  job  applicants  took  a  severe 
setback  in  the  Griggs  et  al.  vs.  Duke  Power  Company  decision  of  the 
U.S.  Supreme  Court  when  it  ruled  that  a  test  could  not  be  used  as  a 
selection  device  unless  the  measured  abilities  represented  by  the  scores 
on  the  test  were  shown  to  be  required  for  acceptable  performance  on  the 
job.  This  decision  had  at  least  two  implications  for  testing.  One,  ob¬ 
viously,  related  to  the  traditional  concept  of  the  predictive  validity 
of  tests,  and  the  other  was  with  respect  to  the  use  made  of  tests. 

Regarding the predictive validity of tests, the court's decision
was quite telling, since most tests predict intermediate criteria well--
such as normatively scored achievement tests--but not more distant,
more ultimate criteria, such as occupational success (Goslin, 1968).
This situation is particularly prevalent in such large institutions as
the military (Thomas, 1972a, 1972b) and the nation's educational systems.

The question of the use, or misuse, of tests focuses on the results
that testing programs produce. The argument has been that differential
prediction or classification of individuals results when they are
categorized on the basis of ethnic and socioeconomic backgrounds. Broadly
stated, differential prediction means that the proportion of individuals
who, for example, pass a selection cutoff score is not the same for the
different categorical groups. Such differential prediction has been
labeled bias because culturally deprived persons have not had the
opportunity to master the material content of the tests nor to develop the
test-taking motivation, experience, and specific skills of other groups
of persons (Goslin, 1968). The bias is usually attributed to the test,
rather than to the uses made of the test, but the argument is not
entirely convincing (Green, 1975). Even on a strictly psychometric basis,
several different definitions of bias are possible (Hunter, Schmidt, and
Rauschenberger, 1977).
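
The broad definition of differential prediction given above (unequal
proportions passing a cutoff) can be illustrated with a minimal Python
sketch. The scores, the group labels A and B, the cutoff of 50, and the
function name pass_rates are all hypothetical and are not taken from the
paper; the sketch only shows the arithmetic of comparing pass proportions.

```python
from collections import defaultdict

def pass_rates(scores, groups, cutoff):
    """Return, for each group, the proportion scoring at or above the cutoff."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for score, group in zip(scores, groups):
        total[group] += 1
        if score >= cutoff:
            passed[group] += 1
    return {g: passed[g] / total[g] for g in total}

# Hypothetical scores for two hypothetical groups, A and B.
scores = [52, 61, 47, 70, 38, 55, 44, 66]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = pass_rates(scores, groups, cutoff=50)
print(rates)  # unequal proportions illustrate the broad definition
```

Under this broad definition, any difference between the two proportions
would count as differential prediction, whatever its source.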

While the Armed Services have managed to escape severe criticism
in the past, there are signs that the situation is changing. The use
of the ASVAB in the High School Testing Program recently received very
sharp criticism from Lee J. Cronbach, and the Office of Management and
Budget (OMB) has instituted a series of inquiries into the management
of their testing programs on the part of the several services.

Complicating the issues of test validity and test usage as sources
of bias is the argument with respect to the roles of heredity and
environment in the determination of a measured mental ability--such as
intelligence. If, as argued by Jensen (1968a), heredity plays the
predominant role by a margin of as much as 2-to-1, then the cultural
deprivation argument loses considerable weight. That is, the important
differences exist, more or less, independent of environmental factors.

On the other hand, if it is argued that the range of performance
capabilities at a fixed hereditary level is broad and essentially
unpredictable due to the influence of many environmental factors (Feldman
and Lewontin, 1975), then the role of cultural and socioeconomic
factors in causing the differential prediction of testing programs must
be acknowledged and corrected. A deceptively simple solution would be
to create tests that are culture free. Presumably, a culture-free
test would be measuring the "real" or hereditary potential--the genotype--
of the person being tested. But, if an operational definition of an
unbiased, culture-free test is that all categories of cultural groups
have the same mean and distribution function on the test, the use of such
a test for selection is highly likely to result in differential outcomes
on some criterion measure, such as the ability to complete a course of
training within a prescribed or reasonable period of time. The test has
been made culture free, but it has little or no predictive validity.

The argument could be made that the fault lies in the criterion, and
not the test. In this case, a third fundamental question regarding
the testing movement arises, and that is the construct validity of a
test, or what the test is supposed to be measuring (Goslin, 1968).

As explained in the preceding argument, the creation of a
culture-free test places a greater burden on the construct validity of
the test rather than its predictive validity, since it may not be
possible to determine the latter in the traditional manner. In
addition to escaping criticism for being biased, a culture-free test of
mental ability with high construct validity would be of great value to
the military services and other large institutions that face increasingly
difficult problems in personnel procurement owing to the shrinking of
the pool from which new recruits must be obtained (Congressional Budget
Office, 1977). Under these circumstances, if standards are not to be
lowered, means must be found to identify individuals with high native
ability who do not score well on traditional tests. It was the purpose
of this project to explore the possibility of developing such a test that
was relatively culture-free, had high construct validity with respect
to identifying individuals of high native ability, and would be feasible
and practical to administer in the military testing environment.

TEST  DEVELOPMENT 


THE  MODEL 

The  first  problem  in  developing  the  test  was  to  find  a  model  upon 
which  to  build  the  test.  A  model,  in  this  usage,  is  a  procedure  or 
paradigm  that  reliably  elicits  for  quantitative  measurement  a  behavior 
that  is  the  result  of  a  cognitive  process  that  is  frequently  involved 
in  many  situations  in  real  life.  Models  of  this  sort  would  be  available 
in  such  traditional  experimental  areas  as  learning  and  memory,  informa¬ 
tion  processing,  problem  solving,  and  decision  making.  It  was  felt  that 
most of the paradigms for information processing placed an overly high
emphasis on verbal behavior and materials and that this feature would
make it difficult to achieve a culture-free test. The problem-solving
paradigm was thought to be inappropriate for test construction
from a reliability and measurement standpoint, since an attempt to
control and standardize the set or approach an individual takes would tend
to destroy the objectives of the paradigm itself, which encourages
experimentation by the subject. Also, the frequency of chance or "Aha"
solutions would tend to make test scoring difficult, categorical, and
unreliable. The decision-making paradigm was not considered appropriate
because of the paradigm's reliance on value systems in the elicited
behavior--value systems developed through life experiences and very much
the product of an individual's culture.

This left the area of learning as a logical choice for the model.
Learning paradigms have been the traditional vehicle of the majority of
research in the behavioristic tradition, and learning ability is
generally recognized as an important ingredient in an individual's
adaptation to a job. In the industrial engineer's armamentarium, the
"learning curve" is an important ingredient for an entire production
process. There are many reliable measures of the learning process--at
least in the aggregate. And the law of effect, in its empirical form,
is without precedence among the many so-called "laws" in psychology.
As quoted and discussed by Estes (1974), Thorndike believed that
intellect is the ability to learn and that estimates of intellect should be
estimates of the ability to learn. In another sense, Thorndike believed
that intellect is the ability to learn more things or to learn the same
things more quickly. Typical intelligence tests that sample the products
an individual is able to produce seem to be assessing intelligence with
respect to the amount of stored information, knowledge, and intellectual
skills, whereas the typical experimental learning paradigm would seem to
consider the rate of learning as a measure of intellectual performance.

Within the field of learning, visual discrimination learning was
selected as the general paradigm in which to build the test because it
has been widely used at many phylogenetic levels to study the evolution
of intelligence (Bitterman, 1965, 1975). There is also an extensive
literature on the visual discrimination learning of human subjects
(Green and O'Connell, 1969). The typical paradigm for visual
discrimination learning involves two or more dissimilar visual stimuli
of which one has been arbitrarily designated as correct. The organism
learns to respond to the correct alternative--e.g., peck the middle
disc--by being reinforced for making the correct choice.

Examination of the Green and O'Connell (1969) bibliography will
show that most of the experimental tasks in visual discrimination
learning have been relatively simple owing to the design of such tasks for
animals, children, and retardates. The visual discrimination learning
situation has been made more complex by manipulating reinforcement
contingencies or the quality of reinforcements. In their altered form,
emphasis has been on such phenomena as reversal learning, probability
learning, and the effects of partial reinforcement and incentive
contrasts. Bitterman has shown that the acquisition (learning) curve may
be very similar for all organisms, but the switch to one of the other
conditions following original learning has led to qualitatively
different behaviors by different species. Thus, it would be highly
desirable to adhere to the basic learning paradigm but make the task
more demanding for the human subject. This could be done by having an
individual learn several discriminations simultaneously, which shall be
called multiple discrimination learning. Except for the fact that
pictorial materials would be used, the situation would be very similar
to verbal discrimination learning (Eckert and Kanak, 1972). In a
typical verbal discrimination learning experiment, a list of several
word pairs is created in which one member of each pair has been
designated as the correct alternative. The pairs, referred to as items,
are presented individually, and a complete presentation of the list is
a trial. The subject instrumentally learns the correct alternatives by
being reinforced when the correct member of the word pair is vocalized.
Arima (1974) has shown that the paradigm is very robust in the sense
that the learning rate is constant regardless of the number of
alternatives (up to four) presented in a stimulus (item) as long as the
information presentation rate is also constant. The key to determining this
relationship was the measurement of information content in terms of
Shannon bits and learning in terms of the information transmission rate.

To  recapitulate,  the  model  for  the  test  was  a  visual  discrimination 
paradigm  presented  in  the  manner  of  verbal  discrimination  learning 
experiments.  That  is,  the  model  calls  for  the  subject  to  learn  several 
visual  discriminations  simultaneously,  a  process  that  will  be  referred 
to  as  multiple  discrimination  learning. 

STIMULUS  MATERIALS 

Construction of a multiple discrimination learning test required
a relatively large set of stimuli that were homogeneous, yet
discriminable, and which were as free of cultural influence or implications as
possible. Homogeneity of stimulus materials was desired so that each of
the stimulus pairs within a "list" could be of comparable difficulty
and so that any stimulus pair would be representative of the test task.
Geometric shapes were eliminated because of their limited numbers and
the possibility that their familiarity and association values might be
linked with cultural variables. Color, hue, and brightness were also
rejected because of the difficulty in production and replication and
because difficulties in sensory discrimination might result when a large
number of items was required. Additionally, there would be the problem
of using the test with colorblind individuals. For these reasons,
two-dimensional, black-and-white patterns of uniform size were investigated.
The set of 30 two-dimensional, random-shaped, metric polygons used by
Arnoult (1956) was found to fit the requirements admirably. They are
shown in Figure 1. Moreover, they had already been categorized, as a
group, as figures having high discriminability.

FIGURE 1. Shapes selected for use in assembling stimulus lists.
(From Arnoult, 1956)

Prior to constructing pairs and lists of items using the forms,
it was necessary to obtain measures of the pairwise similarity of the
forms and to develop a set of pairs for which there would be assurance
that either member would be likely to be chosen as a correct alternative
on a first (guess) trial. It was particularly necessary to develop
pairs with an a priori choice of 50-50 for either member so that the
information content (uncertainty) of each item would be at a maximum (1
bit) and constant within all lists. The similarity measure was desired
because similarity had been found to be a significant variable affecting
learning rate in verbal learning under some conditions. Accordingly, it
was assumed that similarity among and between the stimuli should be
controlled in constructing the test items.

In order to obtain empirical values for these relationships among
the forms, a small data-gathering experiment was conducted. The 30
stimulus polygons were arranged in pairs. All possible pairs were
constructed under the constraint that an item would not be paired with
itself. Left-right order within a given pair was not considered. This
resulted in the assembly of (30 x 29)/2 = 435 different pairings. These
pairs were then arranged in three columns on sheets. Three separate
booklets, each containing 145 pairs, were constructed and distributed to
60 graduate students at the Naval Postgraduate School. Each subject
received a single booklet selected at random from the three, and was
asked to perform two separate tasks--selection of one item from each pair
and rating of the degree of similarity seen between the items of each
pair. Subjects were told that one item in each pair had been arbitrarily
designated as "correct," i.e., the desired response, and were asked to
designate that item which they thought to be the "correct" response. This
selection was to be made with the knowledge that designation of the
"correct" response was made completely arbitrarily.
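
The pair count reported above can be checked with a short Python sketch.
The text does not say how pairs were allocated to the three booklets, so
the random, equal-sized split below (and the function name
split_into_booklets) is an assumption made only for illustration.

```python
from itertools import combinations
import random

N_FORMS = 30  # number of polygons reported in the text

# Unordered pairs with no self-pairing: (30 x 29) / 2 = 435.
all_pairs = list(combinations(range(1, N_FORMS + 1), 2))

def split_into_booklets(pairs, n_booklets=3, seed=0):
    """Randomly divide the pairs into equally sized booklets (assumed scheme)."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_booklets
    return [shuffled[i * size:(i + 1) * size] for i in range(n_booklets)]

booklets = split_into_booklets(all_pairs)
print(len(all_pairs), [len(b) for b in booklets])
```

With 60 subjects each receiving one of three booklets, each pair would be
judged by about 20 subjects, which agrees with the figure given later in
the text.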


Subjects were cautioned to make their choices solely on the basis
of a given pair alone, and without regard to previous selections. This
exercise was intended to simulate as closely as possible the condition of
facing a stimulus pair in a forced-choice situation with no prior
knowledge of the correct item in the pair.


Subjects then went through the list a second time, rating each
pair as to whether the two items in each appeared to be very similar,
slightly similar, or dissimilar. Each pair was then assigned a
similarity factor of one, two, or three, respectively.

The choice preferences of the 60 subjects (20 for each set of 145
pairs) were translated into percentages and cast into a matrix. In
addition, averages of similarity ratings given for each pair were computed
and cast into the same matrix format. Thus pairwise estimates of choice
preference and item similarity were obtained and placed in usable form.
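
A minimal Python sketch of this tabulation follows. The responses, the
pair label (4, 17), and the function name summarize_pair are hypothetical;
only the scale (1 = very similar, 2 = slightly similar, 3 = dissimilar)
and the 20 judgments per pair come from the text.

```python
from statistics import mean

def summarize_pair(choices, ratings):
    """Return (percent choosing the first member, mean similarity factor).

    choices: 1 if the subject picked the first member of the pair, else 0.
    ratings: similarity factors on the 1-to-3 scale described in the text.
    """
    pct_first = 100.0 * sum(choices) / len(choices)
    return pct_first, mean(ratings)

# Hypothetical judgments from 20 subjects for one hypothetical pair.
choices = [1] * 11 + [0] * 9
ratings = [3] * 12 + [2] * 8

# The "matrix" is sketched as a dictionary keyed by the pair of form numbers.
matrix = {(4, 17): summarize_pair(choices, ratings)}
pct_first, mean_similarity = matrix[(4, 17)]
```

In this hypothetical case the pair shows a 55-45 split in choice and a
mean similarity factor near the "dissimilar" end of the scale.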

CONSTRUCTION  OF  TEST  LISTS 

A subgroup of pairs was selected from the original 435 that had
been rated. These pairs were singled out on the basis of choice
preference. Subjects making choices within these pairs had displayed no
significant preference, on the average, for either item in each pair
(selections were distributed either 50%-50% or 45%-55% between each).
This subgroup was then used to construct the test lists. Since no marked
preference for a given item in a pair had been demonstrated, it was felt
that the choice probabilities associated with each could be considered
to be "equally likely" for the purposes of evaluating the information
content of the choice associated with each pair.
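
The "equally likely" approximation can be examined with the standard
Shannon formula for the uncertainty of a choice. The Python sketch below
is an illustration and not part of the original analysis; the function
name choice_entropy_bits is hypothetical.

```python
import math

def choice_entropy_bits(probabilities):
    """Shannon uncertainty, in bits, of a choice with the given probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

h_even = choice_entropy_bits([0.5, 0.5])      # 50%-50% split
h_45_55 = choice_entropy_bits([0.45, 0.55])   # 45%-55% split

print(h_even, round(h_45_55, 4))
```

A 45%-55% split carries about 0.993 bit, so treating every selected pair
as a 1-bit item understates or overstates the information content by less
than one percent.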

Three  stimulus  lists  of  six  pairs  each  were  constructed  from  the 
"equally  likely"  subgroup  of  pairs.  These  lists  were  assembled  under  the 
following  constraints  with  respect  to  the  similarity  variable: 

List I. Figures in each pair were as dissimilar as possible.
In addition, all figures in the entire list were as dissimilar as
possible. (Within-pair similarity factors were at least 2.50, averaging
2.60, while between-pair factors were not less than 1.75, averaging 1.98.)

List II. Figures in each pair were as similar as possible, but
dissimilarity between pairs was maintained. (Within-pair similarity
factors were no greater than 1.95, averaging 1.58; the between-pair
factors were no less than 1.90, averaging 2.20.)

List III. Figures were as similar as possible, both within each
pair and between other figures in the list. (Within-pair similarity
factor was no more than 1.90, averaging 1.73; between-pair factor was
no greater than 2.30, averaging 1.92.)

These lists are presented in Figures 2, 3, and 4, respectively.
As can be seen, the lists were constructed in order to present
discrimination tasks of increasing difficulty. Stimulus items in List I were
chosen to be as distinguishable as possible, minimizing intra- and
interpair confusion. Similarity within pairs was added in List II, but
each pair was kept as distinguishable as possible from other pairs in
the list. Similarity was extended to cover all items in List III.
List III, of course, is the most homogeneous.

When  lists  of  six  pairs  each  had  been  completed,  three  test  lists 
of 60 pairs were assembled. Each test list consisted of 10 repetitions
of  each  of  the  six  pairs  of  Lists  I,  II,  and  III.  Order  within  these 
replicates  was  random.  Left-right  order  within  pairs  was  varied  in  a 
random  fashion  as  well  with  the  restriction  that  a  given  form  was  seen 
on  the  right  five  times  and  on  the  left  five  times.  At  least  one  differ¬ 
ent  pair  was  presented  before  a  given  pair  was  repeated.  The  polygons 
were  not  rotated  or  reversed,  but  were  presented  "upright"  at  all  times. 

Thus each test subject could be presented a total of 60 pairs of
stimuli. Pairs appeared in no apparent order, and the correct response
was not always on either the right or left side; subjects were forced to
learn the correct response in each pair solely on the basis of
recognition of the items within that pair alone.
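
The ordering constraints described above can be sketched in Python as
follows. This is one possible reading, not the procedure actually used:
it assumes that each of the 10 replicates is an independently shuffled
block of the six pairs, and that the left-right balance is achieved by
placing the correct form on the left in exactly five of a pair's ten
presentations. The function name build_test_list is hypothetical.

```python
import random

def build_test_list(n_pairs=6, n_reps=10, seed=0):
    """Return a list of (pair_index, side_of_correct_form) presentations."""
    rng = random.Random(seed)

    # Balance sides: the correct form is on the left in half the presentations,
    # so each form of a pair appears five times on each side.
    sides = {}
    for pair in range(n_pairs):
        order = ["L"] * (n_reps // 2) + ["R"] * (n_reps - n_reps // 2)
        rng.shuffle(order)
        sides[pair] = order

    sequence = []
    for rep in range(n_reps):
        block = list(range(n_pairs))
        rng.shuffle(block)
        # Reshuffle if the block would repeat the pair just presented, so that
        # at least one different pair intervenes before any pair recurs.
        while sequence and block[0] == sequence[-1][0]:
            rng.shuffle(block)
        for pair in block:
            sequence.append((pair, sides[pair][rep]))
    return sequence

test_list = build_test_list()
```

Because each block contains every pair exactly once, an immediate
repetition could occur only at a block boundary, which is the one case the
reshuffling step guards against.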


FIGURE 2. Stimulus List I.
(Least similarity within and between pairs.)
*Indicates "correct" shape.

FIGURE 3. Stimulus List II.
(Maximum similarity within pairs; minimum similarity between pairs.)
*Indicates "correct" shape.

FIGURE 4. Stimulus List III.
(Maximum similarity both within and between pairs.)
*Indicates "correct" shape.

TEST APPARATUS

Test apparatus was designed to provide maximum flexibility in
test administration. The apparatus array used in administering the test
is diagramed in Figure 5. Critical units of the presentation and response
equipment were secured in place throughout the course of test
administration. Distance from the subject (edge of table) to the viewing screen
was 42.5 inches (107.95 cm); reinforcement lights were located 8.5 inches
(21.59 cm) in front of the screen. Stimulus pairs occupied an area on
the screen approximately 6 inches (15.24 cm) high by 9 inches (22.86 cm)
wide.


Stimulus pairs were mounted on 35 mm slides, one pair to a slide.
Since each list was presented a total of 10 times, the 60 slides required
for each list were placed in a carousel. Stimuli were rear projected
onto a Kodak shadow-box screen using a Kodak Ektagraphic Carousel slide
projector. A neutral light-reduction filter (Kodak Wratten gelatin
filter, no. 96 ND 0.50), rated to reduce light transmission by 50 percent,
was fixed over the projector lens to reduce excessive glare on the screen.

A modified Ohr-tronics eight-channel paper-tape reader was used to
control the reinforcement lights (described below) so that only correct
responses would receive reinforcement. Wiring was accomplished so that
the pulse used to advance the slide projector to the next stimulus pair
also advanced the tape reader. Tapes were punched to coordinate with the
ordering of the stimulus list in use.

The apparatus was designed to permit a machine- or self-paced
mode of presentation. Stimulus presentation rate in the machine-paced
mode was controlled by an interval timer. The timer was set to provide
an actuating pulse to both projector and tape reader simultaneously every
4.0 seconds. The time required for the slide projector to cycle from a
presented slide to the next slide was found to be 1.0 second. Since the
projection screen was blank during this cycle time, the stimulus pairs
were visible for only 3.0 seconds before the timer initiated the next
sequence.

Stimulus  presentation  during  the  self-paced  mode  was  controlled 
by  either  of  two  identical  buttons  located  on  the  sides  of  the  response 
box.  Pressing  either  of  these  buttons  initiated  the  electrical  pulse 
that  advanced  the  slide  projector  and  tape  reader.  (These  buttons  were 
inactivated during the machine-paced mode to preclude accidental
disruption of the stimulus presentation rate.)

Two identical buttons fixed on top of the response box were used
to designate choices. Correct responses were reinforced by one of a
pair of 2.5 watt lights placed on a small box directly in front of the
viewing screen. Incorrect responses received no reinforcement.
Responses, regardless of reinforcement, were recorded on a two-channel
Clevite brush recorder. The tapes thus obtained could be used to confirm
observed responses, and in the self-paced mode to measure inter-response
time and total test time.

Twenty-eight volt DC current to power the tape reader and
reinforcement lights was obtained from a Power Designs, Inc.,
Model 3650-S DC Power Supply.

TRIAL ADMINISTRATION

In  order  to  evaluate  the  characteristics  of  the  constructed  test 
under  conditions  as  close  to  operational  as  possible,  and  also  to 
investigate the appropriateness of the various test parameters (list
length, similarity, etc.), it was decided to administer the test to
as many subjects as possible during a five-week period in which they
would be available.

METHOD 

Facilities 

Testing  was  conducted  at  the  Naval  Training  Center  (NTC),  San 
Diego, California. All testing was performed in an isolated room at
the Personnel Testing and Classification Center of the NTC. Since
activity was planned for both morning and afternoon periods, windows in
the testing room were covered with opaque material to reduce
anticipated glare from sunlight and to achieve uniform lighting
conditions in the room.

Subjects 

Subjects  tested  were  160  male  U.S.  Navy  recruits  at  NTC.  Ages 
ranged from 17 to 26 years, with the average being 19 years. Average
stated schooling level for the group was 12th grade (11.78). Schooling
level within the nonwhite subgroup was slightly higher (12.2 years)
than the group average. Nonwhite subjects were predominantly Negro,
although the sample contained Oriental, Malay (Filipino), and
Mexican-American recruits as well. Subjects were assigned to the various
test  conditions  in  order  of  appearance. 

Test  Design 

The experiment was conducted using four test groups. Forty-four
subjects were given the test using self-pacing to control the stimulus
presentation rate. Test List I was used throughout the self-paced
phase. The remaining three groups used the machine-paced mode to present
the stimulus pairs at a constant rate of one every 4 seconds. In the
three machine-paced phases, 43, 40, and 33 subjects were tested using
Test Lists I, II, and III, respectively. A tabular representation of
this test design is shown in Table 1. There it can be seen that the test
variables were pacing mode (self- and machine-paced) and test list,
with the latter being nested under the machine-paced mode.

Table 1.

Test Design

Test     Subjects                        Stimulus
Group    (White; Nonwhite)    Pacing     List
  1      44  (31; 13)         Self       I
  2      43  (30; 13)         Machine    I
  3      40  (31;  9)         Machine    II
  4      33  (29;  4)         Machine    III

Procedure 

Subjects were brought into the testing room in groups of not more
than six. The apparatus was displayed, and the experimental nature of
the testing explained briefly prior to issuing the verbal instructions.
Instructions emphasized the nature of the stimuli, what was required of
the subject in the way of response, and the operation of the apparatus
itself. Subjects were then given the opportunity to ask questions
about the test and procedure, and to decline participation if they so
desired. They were then asked to wait outside the room and were brought
in for testing one by one. The instructions for the test were then
reviewed with each individual as he was seated at the response box
prior to commencement of the experiment.

Stimulus pairs were then presented one by one on the viewing
screen for his test condition. Each group of six pairs was presented
in 10 consecutive trials with no break between groups. As a subject
selected the figure in each pair that he thought was correct, he pressed
the corresponding (right or left) response button in front of him.
Correct responses were reinforced by a small light in front of the view
screen, while incorrect responses received no reinforcement.

As testing was in progress, the experimenter stood behind the
subject and recorded his responses on an answer sheet. Responses were
also recorded electrically on a two-channel Brush recorder. Upon
completion of the test, the subject was cautioned not to discuss anything
he had seen or done in the test with those who had not yet been tested.
This request was repeated to the entire group after all had been through
the test.

Performances by six of the original 160 subjects were discarded.
Improper operation of the self-pacing buttons that put the tape reader
out of phase with the projector was cause for rejection of three
performances. Another subject in the first (self-paced) group was unable
to follow instructions. Timer malfunction caused two performances in
the first machine-paced group to be eliminated.

Seventeen other subjects' performances were not used in the data
analysis because their Navy Basic Test Battery (BTB) scores and/or
demographic data could not be retrieved from computerized records. As
a result of these subject losses, the 137 remaining subjects (white and
nonwhite) were distributed as follows: Group 1 (24, 11); Group 2
(25, 12); Group 3 (28, 8); and Group 4 (30, 3).

RESULTS 

Individual performances in the test, in the form of number of
correct choices made per trial per unit of time, were computed to
arrive at the test measure of effectiveness, Information Processing
Rate (IPR). Specifically, IPR was defined as bits of information
correctly processed per second. Performances in the first trial were not
used, since responses in the initial trial were dependent wholly upon
chance, and as such were not indicative of learning ability.

The number correct in each trial was divided by the amount of
time the stimuli were presented to the subject. (In the machine-paced
mode, this was a constant 3 seconds per pair. Scores for the
self-paced group were scaled to individual rates.) In both situations,
the 1-sec. cycle time (inter-stimulus time) of the slide projector
was not included in computing IPR. The resultant trial IPR
scores were grouped into three blocks of three consecutive trials each.
These figures are listed in Table 2. Rates of processing information
are seen to generally increase over blocks of trials for all groups.
(The single exception is the nonwhite subset of Test Group 4, where
performance declines very slightly over trials. This group contained
three subjects.) Overall performances by all groups were quite similar,
despite differences in pacing mode and stimulus similarity between
groups. Overall performance by the nonwhites in Test Group 1
(self-paced) exceeded that of the whites; the reverse was true for the three
machine-paced groups. Figures 5 and 6 depict aspects of these
situations.
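The IPR scoring just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' scoring routine: the function names and the sample responses are hypothetical, each correct choice is counted as 1 bit (the items were standardized at 1 bit each), and a block score is taken here as the mean of its three trial rates, which the text does not specify.

```python
from statistics import mean

BITS_PER_CORRECT = 1.0     # items were standardized at 1 bit each
VISIBLE_S_PER_PAIR = 3.0   # machine-paced viewing time per pair
PAIRS_PER_LIST = 6


def trial_ipr(n_correct, viewing_time_s):
    """Bits correctly processed per second of stimulus exposure in one trial."""
    if viewing_time_s <= 0:
        raise ValueError("viewing time must be positive")
    return BITS_PER_CORRECT * n_correct / viewing_time_s


def block_iprs(correct_by_trial, time_by_trial):
    """Discard trial 1, then summarize trials 2-4, 5-7, and 8-10.

    Averaging within a block is an assumption made for this sketch.
    """
    if len(correct_by_trial) != 10 or len(time_by_trial) != 10:
        raise ValueError("expected exactly 10 trials")
    rates = [trial_ipr(c, t) for c, t in zip(correct_by_trial, time_by_trial)]
    scored = rates[1:]  # trial 1 reflects chance responding only
    return [mean(scored[i:i + 3]) for i in range(0, 9, 3)]


# Hypothetical machine-paced subject: 6 pairs x 3 s = 18 s of exposure per trial.
correct = [3, 3, 4, 4, 5, 5, 5, 6, 6, 6]
times = [PAIRS_PER_LIST * VISIBLE_S_PER_PAIR] * 10
blocks = block_iprs(correct, times)

if __name__ == "__main__":
    print([round(b, 3) for b in blocks])
```

For a self-paced subject the same functions apply, with each trial's measured exposure time substituted for the constant 18 seconds.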


The results listed in Table 2 were subjected to an analysis of
variance using a three-way design compensating for unequal cell
populations by test group, racial group, and blocks of trials as described
by Kirk (1968). The results of this analysis are presented in Table 3.
Significant effects were noted between racial groups and among blocks
of trials. The blocks effect is important from the construct validity
standpoint in demonstrating that learning did occur over all conditions
of the experiment. It should also be noted that pacing mode and
similarity were confounded in the test group variable in this analysis,
but had the primary effects of either of these variables been
substantial, the analysis would have resulted in a significant F for the
test group variable. On the other hand, if the effects of both variables
had been substantial, the effects on the test group variable would have
been indeterminate because of the possibility that the effects of one
might cancel the effects of the other.

FIGURE 6. Information Processing Rate (bits/sec) by Racial Group,
Pacing Mode, and Blocks of Trials.

Table 3.

Analysis of Variance of Overall Performance by Test
Group, Racial Group, and Blocks of Trials

Term                df              SS          MS        F        p
Total              243    1,829,659.50          --       --       --
Test Group (T)       3        5,082.50    1,694.10    0.230     n.s.
Racial Group (R)     1       31,511.00   31,511.00    4.288     <.05
Trial Block (B)      2      117,910.00   58,955.00    8.023    <.001
T X R                3       24,396.00    8,131.90    1.106     n.s.
T X B                6       10,165.00    1,694.10    0.230     n.s.
R X B                2       21,346.00   10,673.00    1.452     n.s.
T X R X B            6       13,214.00    2,202.30    0.299     n.s.
Error              220    1,616,200.00    7,347.60       --       --
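Each F in Table 3 is the ratio of an effect's mean square to the error mean square. The short Python check below is a sketch only: the function name is hypothetical, the mean squares are copied from Table 3, and the printed F values appear to have been truncated rather than rounded, so agreement is expected only to about the third decimal place.

```python
# Mean squares as printed in Table 3.
MS_ERROR = 7347.60
MEAN_SQUARES = {
    "Test Group (T)": 1694.10,
    "Racial Group (R)": 31511.00,
    "Trial Block (B)": 58955.00,
    "T X R": 8131.90,
    "R X B": 10673.00,
    "T X R X B": 2202.30,
}


def f_ratio(ms_effect, ms_error):
    """F statistic for a fixed effect tested against the error term."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


f_values = {term: f_ratio(ms, MS_ERROR) for term, ms in MEAN_SQUARES.items()}

if __name__ == "__main__":
    for term, f in f_values.items():
        print(f"{term:18s} F = {f:.4f}")
```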

In order to assess the effects of pacing mode, an analysis of
variance was conducted using the total IPR as the dependent measure
and racial group and pacing as the independent variables. Racial group
was included in the analysis because of the possible interactive effect
with the pacing variable, as suggested in Figure 6. With the data
collapsed over blocks of trials, the racial variable was not significant
(Table 4). The pacing effect was not significant and the hypothesized
interactive effect attained an F value that was between the .10 and .20
levels of probability.

In order to assess a possible similarity effect, an analysis of
variance was conducted using the total IPR as the dependent measure
and racial group and similarity (stimulus set) as the independent
variables. Only the machine-paced test groups were used for this
analysis. The results, shown in Table 5, found racial group to be
significant at less than the 2 percent level of probability, while
similarity and the interaction term were not statistically significant.

In addition to the implications for the similarity variable, the
comparative analysis provided by Tables 4 and 5 with respect to race
indicates that race did have a significant effect when the subjects were
machine-paced but not when they were allowed to pace themselves.

Finally, in order to confirm that subjects showed a significant
difference in their learning rates, as one would expect from the
sizable error terms in all of the preceding analyses, several analysis
of variance tests were conducted using a repeated measures design with
subjects and blocks of trials as the independent variables, and the
interaction of these two effects as the error term. The dependent variable
was the IPR per subject per block. Four such tests were conducted by
partitioning the total sample by race and pacing mode. The F ratios
were all highly significant for subjects and blocks of trials with most
of them at the .001 level of probability.

Internal reliability of the test itself was investigated using
a split-half design for each test group and each racial group as well
as for overall performances. Processing rates were compared for trials
4, 6, and 8 against those of trials 5, 7, and 9. In addition, scores
on the latter group of trials were compared with those obtained on
trials 6, 8, and 10. The former comparison will be referred to as
"low trials" and the latter, as "high trials."

Correlation coefficients thus obtained were used in the
Spearman-Brown formula for split-half correlations. Both the raw coefficients
and the Spearman-Brown coefficients are listed in Table 6. A majority
of the coefficients are seen to be statistically significant.

Table 4

Analysis of Variance of Overall Performance
by Racial Group and Pacing Method

Term               df            SS           MS        F       p
Total             137   465,094.994           --       --      --
Racial Grp (R)      1     4,417.475    4,417.475    1.310    n.s.
Pacing Mode (P)     1       242.501      242.501    0.071    n.s.
R X P               1     8,772.961    8,772.961    2.602    n.s.
Error             134   451,662.057    3,370.612       --      --

Table 5

Analysis of Variance of Overall Performance
by Racial Group and Stimulus Set
(Machine-Paced Only)

Term               df          SS         MS        F       p
Total             102   5,316.928         --       --      --
Racial Grp (R)      1     342.169    342.169    6.810    .020
Stimulus Set (S)    2       4.758      2.379    0.047    n.s.
R X S               2      96.117     48.058    0.956    n.s.
Error              97   4,873.884     50.246       --      --

Table 6

Split-Half Reliability Coefficients

                       Low Trials          High Trials
                      (468 vs 579)        (579 vs 6810)          Totals
Group  Race          r(raw)   r(S-B)     r(raw)   r(S-B)       Low      High
  1    White          .767    .868**      .713    .832**
       Nonwhite       .756    .861**      .864    .927**
       Group total                                            .865**   .872**
  2    White          .800    .889**      .865    .928**
       Nonwhite       .700    .824**      .826    .905**
       Group total                                            .871**   .921**
  3    White          .615    .762**      .632    .775**
       Nonwhite       .367    .537        .535    .697*
       Group total                                            .722**   .759**
  4    White          .674    .805**      .664    .798**
       Nonwhite       .637    .778        .610    .758
       Group total                                            .802**   .794**
Totals White                                                  .835**   .843**
       Nonwhite                                               .788**   .873**
       Combined                                               .824**   .851**
       Overall                                                    .838**

 *Significant at p < .05.
**Significant at p < .01.
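The Spearman-Brown step-up behind the r(S-B) columns of Table 6 can be reproduced from the raw half-test correlations. The Python sketch below is illustrative; the function name and the `length_factor` parameter are introduced here, and the raw coefficients are copied from Table 6.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate the reliability of a test `length_factor` times as long.

    With length_factor=2.0 this is the usual split-half correction,
    r_full = 2r / (1 + r).
    """
    denominator = 1.0 + (length_factor - 1.0) * r_half
    if denominator == 0:
        raise ValueError("correction is undefined for this correlation")
    return length_factor * r_half / denominator


# Raw split-half coefficients from Table 6 (Group 1 and Group 3 rows).
raw_coefficients = {
    "Group 1 White, low trials": 0.767,
    "Group 1 Nonwhite, high trials": 0.864,
    "Group 3 Nonwhite, low trials": 0.367,
}

if __name__ == "__main__":
    for label, r in raw_coefficients.items():
        print(f"{label}: raw {r:.3f} -> corrected {spearman_brown(r):.3f}")
```

The corrected values match the tabled entries (.868, .927, and .537) to three decimals.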

The relationship between scores on the experimental test and the
traditional methods of measuring Navy recruit potential was investigated
using the test subjects' scores on the Navy General Classification Test
(GCT), a major portion of the standard Basic Test Battery (BTB). The
basis for the GCT lies in verbal ability, since the test consists of
sentence completions and verbal analogies. Test scores are scaled on
a normalized distribution with a mean of 50 and a standard deviation of
10. Performance on the Arithmetic Reasoning Test (ARI) is often
combined with GCT scores to obtain a rough "multiple" used in determining
Navy technical school eligibility and aptitude.

Pearson product-moment correlations were computed between test
scores and GCT scores obtained from individual service files. (One
nonwhite subject was dropped from this analysis because his GCT score
was not available.) These correlations were determined for racial
subgroups of subjects falling below and above the GCT mean score of 50,
for both racial groups in toto, and for the entire sample. These figures
are seen in Table 7. Significant values of the correlation coefficient
are noted only in the white group as a whole and for the entire sample.
Nonwhite test scores did not correlate significantly with GCT performance.
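For readers who wish to repeat this kind of analysis, the product-moment correlation and the split at the GCT mean can be sketched as follows. Everything in the example is hypothetical: the function name, the (GCT, IPR) pairs, and the handling of a score of exactly 50, which the paper does not state and which is simply excluded from both subgroups here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Made-up (GCT score, IPR) pairs, for illustration only.
records = [(38, 0.17), (44, 0.21), (47, 0.19), (52, 0.22), (58, 0.20), (63, 0.26)]

GCT_MEAN = 50
low = [(g, i) for g, i in records if g < GCT_MEAN]
high = [(g, i) for g, i in records if g > GCT_MEAN]

r_all = pearson_r([g for g, _ in records], [i for _, i in records])
r_low = pearson_r([g for g, _ in low], [i for _, i in low])
r_high = pearson_r([g for g, _ in high], [i for _, i in high])

if __name__ == "__main__":
    print(round(r_all, 3), round(r_low, 3), round(r_high, 3))
```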


Table 7

Correlations of Test Performance (IPR) with Navy
General Classification Test (GCT) Score

                          Group Averages      Correlation Coefficient
Group               N      GCT      IPR     GCT Grp    Race Grp    Total
Nonwhite           33                                    .213
  Low  (<50)       24     42.67    .208      .316
  High (>50)        9     56.89    .207      .601
White             104                                 [illegible]
  Low  (<50)       17     42.18    .207      .253
  High (>50)       87     59.63    .238      .050
Entire sample                                                      .270**

 *Significant at p < .05.
**Significant at p < .01.

DISCUSSION


CONSTRUCT  VALIDITY 

The test was constructed to be a measure of learning ability with
the implication that learning ability is a manifestation of the
intellectual capacity of a person. Differences in this intellectual capacity
between individuals were assumed to be measurable by the rate with which
new material is learned. Using IPR as the rate measure, the results
of the trial administration of the test showed that learning took place
and that the rate was different among individuals. Moreover, the results
were found to be highly reliable--especially for a 4-minute test--using
an internal (split-half) criterion of reliability. Thus, the basic
essential requirements for the construct validity of the test would seem
to have been adequately demonstrated. Additional experimentation would
be required to show that it is, indeed, a differential measure of
intellectual capacity. Probably the best way to demonstrate this essential
requirement would be to give the test to different age groups. The
fact that the items had been standardized for information content (1 bit
per item) would make it possible to administer shorter forms of the
test--e.g., four instead of six items--to different age groups and yet have
the IPR mean the same when corrected for total information content of
the stimulus lists.

Earlier in this paper, it was stated that the construct validity
of a test required an answer to the question, What does the test measure?
The answer given here is learning ability. But, as Estes (1974) has
argued, a product-defined measure of intelligence or ability does not
provide an understanding of what intelligence is. Rather, the process
should be defined and the relationship between the process and the
product measure should be determined. The design of this trial
administration of the test does not provide opportunities to answer the process
question. Since similarity, however, was not a significant variable,
visual discrimination of the stimuli would not seem to have been involved
in the learning process. Based on a great deal of research in recent
years in the area of human learning and information processing, it would
be safe to say that some form of coding of the individual forms and,
probably, the stimulus pairs as an entity was required. Additionally,
short-term memory was required to hold the information pertaining to one
item in working memory while processing a new item. Here, some sort of
mnemonic device might be involved, and in both cases verbal fluency and
image formation might be the basic skills underlying these processes.

With respect to verbal ability playing a role, the small, significant
correlation between IPR scores and the GCT scores for the white group
would support this contention. Taken in conjunction with this finding,
the absence of a significant correlation for the nonwhite group could
also be seen as not disaffirming the trend, if it is assumed that the
GCT score is not as good a measure of verbal ability for subjects in
the nonwhite group. These results, however, only emphasize that the
measure of verbal fluency or the capacity to generate useful images
must be appropriate to the cultural background of the individual subject.

CULTURAL IMPLICATIONS


If the subjects--white and nonwhite--had comparable learning
abilities, no racial group differences would be found on the IPR.
The study found no significant differences among the self-paced
subjects, but a significant difference was found for racial groups in the
machine-paced mode. A problem in attempting to determine from the
experiment data whether the white and nonwhite groups differed in
learning ability lies in the fact that the subjects were a selected group
that was not representative of America's youth in general. As noted,
the average education level was at the 12th grade. The information in
Table 7 shows that 60 percent of the sample was above the median in
GCT scores. There was a considerable difference between racial groups,
however, with 84 percent of the white group being above the 50th
percentile, whereas only 27 percent of the nonwhite subjects were in that
category. There was a small but significant correlation of GCT scores
with the IPR, but only for the white group and the entire sample. How
can these data be related to the cultural implications of the test?

With respect to the differences noted in the paced and self-paced
groups, it may be that the machine-paced format placed greater pressure
on the subjects and generated greater test anxiety. Where short-term
memory and the learning of discriminations involving very similar items
constitute the task, the effects of anxiety could be disruptive as shown
by Taylor and Spence (1952) and Ramond (1953) in serial, verbal learning
tasks. For anxiety to have a differential effect in the racial groups,
the anxiety induced by the test conditions would have to be greater for
the nonwhite group. This could be true as a part of the larger picture
of differences in test-taking motivation, attitudes, experience, and
skill that have been attributed to different cultural backgrounds. If
these contentions are valid, then the self-paced mode would be more
culture-free in its assessment of the test subject. If the finding
in this trial administration of the test for the self-paced condition
should hold up in subsequent administrations, then this would be strong
evidence for the culture-fair nature of this test.

The pattern of correlations between the IPR and the subjects'
GCT scores takes the form that Jensen (1968b) found with children of
high and low socioeconomic status (SES) groups. Noting that children from
low SES backgrounds with IQs in the range of 60 to 80 appear to be much
brighter in social and nonscholastic behavior than their middle- or
upper-middle SES counterparts, he gave groups of such children learning
tasks in the laboratory and compared their learning performance with
standard intelligence test scores for the children. There was a
substantial correlation of IQ and learning scores for middle-class children,
but the correlation was negligible for children from low SES backgrounds.
Jensen attributed the difference to the fact that the learning tasks
and the intelligence tests measured two different levels of intelligence
with the lower level, measured by the learning tasks, being common to
both groups and the other being better represented within the high SES
group. In the present instance, it would seem more parsimonious to
conjecture that the IPR was a measure of intellectual capability for both
groups, whereas the GCT, which has been found to be culturally biased
(Stephan, 1973; Thomas, 1972c), was a fair measure only for the white
group. In addition, the significant correlations accounted for only
a very small portion of the variance in IPR scores. Accordingly, it
would appear that the multiple discrimination test is indeed culture
fair and provides an unbiased measure of learning ability, at least
in the self-paced form. Larger and more numerically balanced samples
from an unselected population would be necessary to confirm these
conclusions.

TEST  AND  TESTING  CONSIDERATIONS 

Discussion  in  this  section  will  deal  with  the  psychometric  and 
physical aspects of the multiple discrimination learning test.
Specifically, the length of the test, additional matters pertaining to the
pacing  mode,  and  the  physical  packaging  of  the  test  will  be  considered. 

Test  Length 

The decision to stop the test after 10 trials was arbitrary.
Several subjects showed errorless performance within this limitation.
In the machine-paced mode where there was a theoretical limit to the
IPR of .333 bits/sec., examination of the third block of trials showed
that the white subjects attained a maximum of 80 percent of this perfect
learning rate, while nonwhites reached 69 percent of this quantity.
While it is not possible to tell how many trials are required for
perfect learning, since a trials-to-criterion design was not used, it would
be advisable from a psychometric standpoint to stop short of perfect
learning when the difference in learning rate among subjects is more
variable. There would also be a tradeoff between a test length of
maximum discriminability among subjects and one of highest reliability,
which might not be the same. Thus, the optimum test length is not a
simple question, and it remains to be determined.
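The .333 bits/sec. ceiling follows from one bit per pair and 3 seconds of exposure per pair, and the percentages quoted above convert to rates in the same way. The Python sketch below is illustrative only; the function names are assumptions introduced here.

```python
VISIBLE_S_PER_PAIR = 3.0   # machine-paced exposure per stimulus pair
BITS_PER_PAIR = 1.0        # each pair was standardized at 1 bit


def ceiling_ipr(bits=BITS_PER_PAIR, seconds=VISIBLE_S_PER_PAIR):
    """Highest IPR attainable when every choice is correct."""
    if seconds <= 0:
        raise ValueError("exposure time must be positive")
    return bits / seconds


def ipr_at_percent(percent_of_ceiling):
    """Convert a percentage of the ceiling into bits per second."""
    return ceiling_ipr() * percent_of_ceiling / 100.0


if __name__ == "__main__":
    print(round(ceiling_ipr(), 3))       # theoretical limit
    print(round(ipr_at_percent(80), 3))  # white subjects, third block
    print(round(ipr_at_percent(69), 3))  # nonwhite subjects, third block
```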

Pacing Mode

It has been previously shown that pacing mode appeared to make a
difference in test results, with the self-paced mode being more
culture-fair. From a psychometric standpoint, the difference between the two
methods is that the self-paced mode places no limit on the IPR that a
subject might attain. This would lead to greater variability among
subjects and, presumably, a more reliable differentiation among test
takers. Since many more variables are free to exert their effects with
the self-paced mode, it may be, however, that less reliable performance
results. The self-paced mode, though, should be more representative
of the manner in which a subject approaches and deals with a problem,
and the results of the testing, as a consequence, would be more
generalizable to real-life situations where learning is required. That is, it
should permit greater predictive validity.


The self-pacing mode would also be desirable on the basis of the
discussion on the construct validity of the test. There it was stated
that the rate of learning would be the measure of learning ability, and
the self-paced mode is the only one that permits an assessment of this
measure. The highest rate in this study was .503 bits/sec., which
occurred in the nonwhite subgroup of the self-paced condition.
Accordingly, the self-paced mode would appear to be the better procedure for
this test.

Physical  Packaging  of  the  Test 

The type of stimulus materials, their presentation method, and
scoring make it relatively simple to institutionalize the test using
teaching machines with true-false or multiple-choice response provisions.
Scoring counters could be readily integrated with the machine. With the
ever-expanding use of computer terminals at remote locations, the test
could easily be set up to be administered from a central location. This
would permit the ready selection of a test "form" from among several
that could be accessed, and scoring and performance analysis would be
almost instantaneously provided upon completion of testing.

A specific item that requires improvement over the set-up used in
this trial administration of the test is the advance procedure in the
self-paced mode. In this trial, the subject had to call for the next
stimulus after responding by pressing a button on the side of the
response unit. As a result, learning times for the self-paced group might
have been slightly biased upwards.

Another feature that requires investigation is whether the
reinforcement should be given by a signal only for correct choices. That
was the procedure in this trial administration. The learning
literature has a large number of studies that have investigated positive
reinforcement, negative reinforcement, both positive and negative
reinforcement, and correction vs. noncorrection methods (e.g., Arima, 1965).
There is a good likelihood that the correction method might be best
for this test. That is, the next stimulus item will not appear until
the subject presses the correct button. If the subject has initially
chosen the incorrect alternative, he or she must press the correct
button. The best mode should be determined by experimentation.
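The difference between the two procedures can be made concrete with a small sketch. The Python function below is hypothetical and is not a description of any apparatus used in the study; in particular, scoring an item on the first press only is an assumption made for the illustration.

```python
def run_item(correct_side, presses, correction=True):
    """Process button presses for one stimulus pair.

    correct_side -- "L" or "R", the reinforced alternative
    presses      -- the subject's presses for this item, in order
    correction   -- True: the item stays up until the correct button
                    is pressed; False: the item advances after one press

    Returns (first_press_correct, presses_used). Raises StopIteration
    if the presses run out before the item is resolved.
    """
    remaining = iter(presses)
    current = next(remaining)
    used = 1
    first_correct = current == correct_side
    if correction:
        # Hold the item until the subject finds the correct button.
        while current != correct_side:
            current = next(remaining)
            used += 1
    return first_correct, used


if __name__ == "__main__":
    print(run_item("L", ["R", "L"], correction=True))
    print(run_item("L", ["R", "L"], correction=False))
```

Under the correction method an initial error costs the subject at least one extra press, so self-paced exposure time, and therefore IPR, would be affected by the choice of method.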

SUMMARY AND CONCLUSION

The purpose of this study was to develop a test of learning
ability that would not be affected by the cultural background of the
individual being tested. A test was created using randomly shaped,
2-dimensional polygons presented in pairs in a discrimination learning
paradigm. Three different lists of six such pairs were created so that
multiple discrimination learning was involved. The lists were presented
individually in a manner similar to verbal discrimination learning in
both a self-paced and machine-paced mode.

In a trial administration of the test using Navy recruits as
subjects, significant learning took place over 10 trials. Nonwhite and
white racial groups, which differed significantly on their Navy General
Classification Test scores, performed at a comparable level in the
self-paced mode. The adjusted reliability of the test (split-half) was .85.
The correlation of the test scores with the GCT scores was marginally
significant for the white group and the total sample, but not for the
nonwhite group. There was no difference in performance among the three
lists, which differed considerably in the similarity of the stimulus
materials. This suggested that any combination of the forms could be
used to create equivalent alternate forms.

It was concluded that a practical test of learning ability that
was culture fair to both the white and nonwhite groups had been
demonstrated. Refinement of the test would be desirable with respect to
optimal length, reinforcement procedure (correction vs. noncorrection),
and the physical packaging of the test.

Cutting Scores: Legal Implications


Lawrence  S.  Buck 


Abstract 

A  vital  step  in  the  process  of  test  development  is  determining 
whether  the  test  in  question  requires  that  a  standard  be  set,  i.e., 
that  a  cutting  score  is  needed.  Cutting  scores  are  generally  considered 
essential  for  criterion-referenced  tests  (CRTs)  while  for  norm-referenced 
tests  (NRTs)  the  issue  is  less  clear.  Cutting  scores  cannot  always  be 
strongly justified on psychometric grounds for NRTs but are often
justified on other grounds such as legal issues, administrative reasons
and probabilistic terms. Therefore, both CRTs and NRTs will usually
require the setting of a cutting score.

The  process  by  which  the  cutting  score  is  determined  as  well  as  the 
particular  choice  of  cutting  score  will  have  legal  implications  for  the 
test  as  a  whole.  As  employee  selection  practices  and  procedures  come 
under  increasing  scrutiny  and  challenge  in  the  courts,  issues  related  to 
the  determination  and  setting  of  cutting  scores  will  also  be  scrutinized. 

This  report  looks  at  some  of  the  legal  implications  of  setting  cutting 
scores  for  both  CRTs  and  NRTs  in  terms  of  professional  guidelines,  accepted 
practices  and  related  court  decisions.  Because  legal  precedence  is  one 
of  the  basic  foundations  of  American  law,  a  close  look  will  be  taken  at 
some  court  cases  that  have  at  least  in  part  been  concerned  with  cutting 
scores.  This  is  done  with  an  eye  towards  determining  if  or  what  legal 
precedence  exists  and  what  positions  the  various  courts  that  have  dealt 
with  cutting  score  issues  have  taken. 


A vital step in the test development process is determining whether
the test in question requires that a standard be set, that is, that a
cutting score is needed. The issue is not always clear as to whether
cutting scores are in fact necessary. While cutting scores are generally
considered to be essential for criterion-referenced tests (CRTs), for
norm-referenced tests (NRTs) the issue is less clear. Cutting scores cannot
always be strongly justified on psychometric grounds for NRTs. With NRTs,
it is often difficult to determine from the number of test questions
answered correctly which score should distinguish qualified applicants
from unqualified applicants. However, the use of cutting scores with
NRTs is often justified on other grounds such as legal issues,
administrative reasons and/or probabilistic terms. In situations in which
minimum standards are thoroughly and validly determined to be essential to
success on the job, as is often the case with CRTs, little issue can be
taken with the cutting scores selected as long as they accurately reflect
the minimum standards. Therefore, one usually finds cutting scores used
with both CRTs and NRTs.

The term "cutting score" is often assigned a variety of meanings, often
depending on the test and situation in which a score(s) is designated to be
a cutting score(s). For purposes of this report, "cutting score" refers to
an established standard and is considered to be synonymous with other terms
such as: qualifying score, critical score, passing point, cutoff score and
cut score when these terms also refer to an established standard.
Regardless of the way in which one chooses to define cutting scores or the
methods used to set cutting scores, it is important to point out that the
process of selecting or setting a cutting score(s) is not totally objective
for either NRTs or CRTs. As Ebel (1972) so aptly points out:

Anyone who expects to discover the 'real' passing score...
is doomed to disappointment, for a 'real' passing score
does not exist to be discovered. All any examining
authority that must set passing scores can hope for, and
all any of their examinees can ask, is that the basis
for defining the passing score be defined clearly, and
that the definition be as rational as possible. (p. 496)

While  there  will  always  be  a  need  for  some  human  judgment  in  the  process  of 
setting  a  cutting  score  there  are  many  aspects  of  the  process  that  can  be 
realistically  and  appropriately  determined.  Any  cutting  score(s)  used  must 
be  thoroughly  justified  and  documented. 

It is essential that any cutting score(s) used be defensible, as
employee selection procedures and practices have come under increasing
scrutiny in recent years, especially in regard to such issues as test bias,
adverse impact and discrimination. Cutting scores (when used) are a vital,
although often overlooked, part of the selection process and as such are
also subject to careful examination. Many employee selection practices
have been and are presently being challenged through the courts by minority
groups. Where the selection process is at least in part based on normative
standards, the choice of any cutting score(s), no matter how it is
determined, may be susceptible to attack. It is the responsibility of the
test developer to ensure that any cutting score(s) is properly documented
and supported.

The crucial question then becomes, how does the test developer ensure
that a cutting score(s) is properly and adequately supported? Where can
test developers look for guidance in this area and what kinds of guidance
can they expect to find? What are the legal implications of setting cutting
scores for CRTs and/or NRTs in terms of professional guidelines, accepted
practices and related court decisions? Increasing interaction with the
legal system of our country is having and will continue to have a great
impact on the field of testing. Legal decisions have in the past played,
and will in the future play, a role in the manner in which the subject of
cutting scores is treated. A basic foundation of American law is the
concept of precedence and it is to the courts that test developers will
look for legal guidance in the use of cutting scores. In addition, the test
developer must also look to professional guidelines for advice and guidance
on cutting score issues.

Before  delving  into  some  of  the  legal  decisions  that  impact  on  cutting 
score  issues,  let  us  first  look  at  cutting  scores  as  they  are  dealt  with  in 
some  professional  guidelines  and  regulations.  The  "Federal  Executive  Agency 
Guidelines  on  Employee  Selection  Procedures"  state  that: 

where cutoff scores are used, they should normally
be set so as to be reasonable and consistent with
normal expectations of acceptable proficiency within
the work force. If other factors are used in
determining cutoff scores, such as the relationship between
the number of vacancies and the number of applicants,
the degree of adverse impact should be considered.

(Federal Executive Agency Guidelines on Employee
Selection Procedures, published simultaneously by the
U.S. Civil Service Commission, the U.S. Department of
Justice and the U.S. Department of Labor, 1976, p. 51793)

The Equal Employment Opportunity Commission's "Guidelines on Employee
Selection Procedures" (1976) treat the subject of cutting scores as follows:

...for each test that is to be established or
continued as an operational employee selection
instrument, as a result of the validation study, the
minimum acceptable cutoff (passing) score on the
test must be reported. It is expected that each
operational cutoff score will be reasonable and
consistent with normal expectations of proficiency
within the work force or group on which the study
was conducted. (p. 51985)

In  addition, 


...where  a  test  is  valid  for  two  groups  but  one 
group  characteristically  obtains  higher  test 
scores  than  the  other  without  a  corresponding 
difference  in  job  performance,  cutoff  scores  must 
be  set  so  as  to  predict  the  same  probability  of 
job success in both groups. (p. 51585)

And, in discussing continued use of tests which are not fully supported
by the required evidence of validity:

It  is  expected  also  that  the  person  may  have  to 
alter  or  suspend  the  cutoff  scores  so  that  score 
ranges  broad  enough  to  permit  the  identification 
of criterion-related validity will be obtained.

(p.  51986) 

The "Principles for the Validation and Use of Personnel Selection
Procedures" published by the Division of Industrial-Organizational
Psychology, American Psychological Association (1975) has the following to
say about cutting scores:

If cutting scores are used as a basis for decision
(i.e., as rigid pass-fail points) the rationale or
justification should be known to all users. This
principle does not recommend cutting scores. Rather,
'The intent is to recommend that test users avoid
the practice of designating purely arbitrary cutting
scores they can neither explain nor defend.' If
cutting scores are to be established, some
consideration should be given to the different effects
of different cutting scores; e.g., the effects of the
two kinds of error: selecting people who prove
unsatisfactory as opposed to rejecting people who would
have been satisfactory if hired. (p. Hi)
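
The two kinds of error named in the Principles can be tabulated for each candidate cutting score. The following is a minimal sketch under stated assumptions, not a procedure taken from the Principles: the function name `selection_errors` and the sample data are hypothetical, and it assumes that job outcomes are known for every examinee, which in practice is rarely true for rejected applicants.

```python
def selection_errors(scores, satisfactory, cutting_score):
    """Count the two kinds of selection error at one cutting score.

    scores        -- test scores, one per examinee (hypothetical data)
    satisfactory  -- True if that examinee performs satisfactorily on the job
    cutting_score -- examinees scoring at or above this value are selected

    Returns (false_accepts, false_rejects):
      false_accepts = selected people who prove unsatisfactory
      false_rejects = rejected people who would have been satisfactory
    """
    false_accepts = sum(
        1 for s, ok in zip(scores, satisfactory) if s >= cutting_score and not ok
    )
    false_rejects = sum(
        1 for s, ok in zip(scores, satisfactory) if s < cutting_score and ok
    )
    return false_accepts, false_rejects


# Hypothetical illustration: compare the error trade-off at three cutting scores.
scores = [50, 60, 70, 80, 90]
satisfactory = [False, True, False, True, True]
for cut in (55, 65, 75):
    print(cut, selection_errors(scores, satisfactory, cut))
```

Raising the cutting score tends to exchange one kind of error for the other, which is the consideration the quoted passage asks test users to weigh and document.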

The "Standards for Educational & Psychological Tests" (1974) discuss
cutting scores as follows:

If specific cutting scores are to be used as a
basis for decisions, a test user should have a
rationale, justification, or explanation of the
cutting scores adopted. ...The test user should
have some justifiable reason for the adoption of
a given cutting score. ...This standard does not
attempt to recommend a specific procedure for
developing cutting scores where they are to be
used. The intent is to recommend that test users
avoid the practice of designating purely arbitrary
cutting scores they can neither explain nor defend.

(pp. 66-67)

The Standards suggest a variety of ways of selecting a cutting score. For
example, with content-referenced interpretations of mastery tests, the
cutting score "might be determined as the obtained score at which one can
reject, at a preselected level of probability, the hypothesis that a
predesignated confidence interval for that score includes the perfect score
on the test" (p. 66). In other situations the cutting score(s) might be
based on "a designated probability of achieving a specified level of
success;" the score "that will maximize the discrimination between high-
and low-criterion groups;" or "on a distribution of scores in a
'predicted-yield' situation" (pp. 66-67).

The guidelines and regulations that are quoted above do not give a
great deal of coverage to cutting score issues. However, to summarize
the main points, they generally stress that cutting scores should never
be purely arbitrary; should be reasonable and consistent with normal
expectations of acceptable proficiency within the work force; and should
be justified and documented.

As I previously pointed out, the concept of precedence is a basic
foundation of American law. As legal challenges increase, test developers
must look to the courts for legal guidance on cutting score issues. A
number of court cases are reviewed below that have, at least in part, dealt
with cutting score issues. This is done with an eye towards determining if
the various court decisions have delineated any clear and consistent
guidelines for cutting scores. It will be readily apparent that there is not
a great deal of agreement among the various court decisions in the stances
that they have taken towards cutting score issues. In fact, many court
decisions seem to contradict each other. In many court cases that are
relevant, cutting scores have been discussed only in very general terms.

It is also true, and unfortunately so, that the judicial lawmaking process
tends to proceed on a case-by-case basis with little or no input from any
situations which have not become involved in a court case. No one can be
sure what the courts will do in the face of new arguments. At this point
in time, we are unprepared to elucidate a selection theory, in terms of
cutting score issues, from the legal decisions that have to date been
handed down.

At the present time one of the more important legal decisions in the
area of testing is the U.S. Supreme Court decision in the Griggs v. Duke
Power Company case. Footnote 11 of this case states in part that:

...an employer may set his qualifications as high
as he likes, he may test to determine which
applicants have these qualifications, and he may hire,
assign and promote on the basis of test performance.

(p.  6135) 

In some cases the courts have held cutting scores to be legal if the
users present evidence that the appropriate subject matter experts endorse
them. For example, in Tyler v. Vickery, a case in which the passing score
for a bar examination was challenged, the cutting score was evaluated as
follows:


...There (in Armstead v. Starkville Municipal Separate
School District, 461 F. 2d 276 5th Cir. 1972) we
suggested that a rationally supportable examination
should 1) be designed for the purpose for which it
is being used, and 2) utilize a cutoff score related
to the quality the examination purports to measure.

Both the essay and the MBE portions of the
examination are designed solely to assess the legal
competence of bar examinees and while the minimum
passing score of 70 has no significance standing
alone, it represents the examiners' considered
judgments as to 'minimal competence required to
practice law,' the precise quality the exam attempts
to measure.

A district judge in Los Angeles held that height and physical agility
cutoffs for the Los Angeles police department do not violate either the
Constitution or Title VII of the Civil Rights Act. In referring to evidence
in the form of criterion-related validation studies the court stated:

...A stronger and taller officer can more quickly
and effectively control another person in the
event of a confrontation. This officer is more
likely to be able to do so without having to
resort to extreme force, such as the use of a gun
or a dangerous control hold. ...As defendants'
affidavits clearly show and common sense confirms,
physical size, strength, and agility have a direct
relation to the quality of the performance of
these functions. ...These qualities are essential
to an effective, efficient, and functional police
force.

In  addition  the  court  stated: 

Since the tests are job related, the setting of
cut-off scores is a matter for the employer's
judgment. Many factors may go into this decision,
such as the number of applicants available, the
cost of failures in training and on the job, the
critical nature of the job to be performed, and
the level of performance at which the employer
desires employees to perform.

While the above cases were generally in favor of the use of
cutting scores, the following court cases have to one degree or another
ruled against the use of particular cutting scores. For example, in
Rogers v. International Paper Company (1975), cutting scores were judged
to be too high in that Lo percent of the skilled craftsmen in the sample
would not have been able to achieve admission to their respective crafts
under this standard. In Hiatt v. Berkeley (1975), the California Superior
Court ruled that the method in question of using tests as a pass/fail
standard was unjustified and arbitrary; tests that are not found to be
job-related should be evaluated on the basis of achievement. In U.S. v.
Central Motor Lines, Inc. (1971), it was ruled that the determination of
the passing score was at the subjective discretion of the employer on an
individual by individual basis and thus the test was unlawful.

The use of the National Teacher Examination as a licensing examination
for public school teachers in North Carolina was declared unconstitutional
in U.S. v. State of North Carolina (1975). The basis for the decision was
that the establishment of the cutting score was considered to be arbitrary
and was not shown to be a measure of the minimum standard for the teaching
profession. While the court did not find anything wrong with the test itself
or the use of a cutoff score per se, it did state that:

such cutoff score shall first have been validated
with respect to minimum academic knowledge an
applicant must possess in order to become a
reasonably adequate and competent teacher and that such
score be shown to bear a rational relationship to
teaching capacity. (Employment Practices Decisions,
1975, p. 1)

Apparently  the  court  felt  that  the  state  had  selected  a  score  calculated  to 
produce  a  given  failure  percentage  for  its  cutting  score,  a  practice  which 
was  in  this  case  declared  unlawful. 

The cases cited in this paper are neither a comprehensive sampling of
all cases in which cutting scores are at least an issue, nor are they
necessarily representative of all such cases. However, they do represent
some important legal cases in which cutting scores have been an issue and
as such they warrant some consideration. In addition, they rather vividly
illustrate the discrepant decisions and philosophies that are being adopted
in regard to cutting score issues. It would be necessary for one to look at
the total picture for each particular case before claims for similarities
or dissimilarities between the various court decisions could be made.
However, from a review of the cited cases and others not here included, it
seems evident that some basic philosophical differences do exist among the
various courts.

One crucial issue that does tend to stand out amidst the confusion is
the requirement stated in many cases that test users must demonstrate the
validity and job-relatedness of their tests. For the most part, while the
various courts have not ruled out the use of cutting scores per se, they
have insisted that such scores be properly documented and supported in terms
of job relatedness.

It is hoped that professional test developers will in the future have
more of an opportunity to provide some input on cutting score issues to the
courts. This may lead to more of a consensus concerning these issues and
better guidelines and regulations to follow when setting cutting scores.

An Evaluation of Select Approaches
for Biased Item Identification


Problem 

Approximately 25 years ago, Eells and his colleagues conducted what appears
to be the first serious attempt to examine test items for bias (Eells, Davis,
Havighurst, Herrick and Tyler, 1951) and developed one of the first measures
purported to be culture fair. Since that time, the entire issue of cultural
bias in measurement has become heated, complex, and pronounced in the
literature. Actions by the National Association of Black Psychologists, the
American Personnel and Guidance Association, the National Education Association,
the National Association for the Advancement of Colored People, the National
Association of Elementary School Principals and the Council of the Society
for the Psychological Study of Social Issues calling for moratoria on certain
types of tests, banning tests, and requiring alternative plans for testing,
indicate the serious nature of the current situation (see Williams, Mosby
and Hinson, 197?). The concern is also apparent in recent litigation (DeFunis
vs. Odegaard, 1974; Diana vs. the California State Board of Education, 1970;
Hobson vs. Hansen, 1967). Naturally, all this has not gone unnoticed by those
involved in the measurement field. Bias and debiasing studies have been
conducted and various models have been proposed in ever-expanding efforts to
meet the challenge of bias in educational assessment.

One major type of bias investigation is concerned with the instrument
as a whole and examines the question: Does a test unduly favor or impede
examinees from different parts of the country or of different backgrounds?
Another is concerned with the items within a test and asks: Which items and
item formats are appropriate for a given population and which may be used
across given cultures?

The first type of investigation is of interest to the test users who
need to evaluate the appropriateness of the test information. The models
proposed by Cleary (1968), Thorndike (1971), Darlington (1971), Cole (1973),
Einhorn and Bass (1971) and Gross and Su (1975) (also see the entire Spring
1976 issue of the Journal of Educational Measurement) exemplify this first type
of investigation. The second type of investigation is of interest to developers
as it assists them in developing valid and cross-culture fair items and provides
a framework for constructing better tests in subsequent efforts. By identifying
and removing biased items from an initial item pool, test developers could,
theoretically, develop a measure free of bias. The work of Angoff (1972),
Cardall and Coffman (1964), Green and Draper (1972), Merz (1973, 1976),
Rudner (1977a), Scheuneman (1975) and Veale and Foreman (1975, 1976) (see the
reviews by Merz, 1977 and Rudner, 1977b) has been directed at this need. It
is this second type of bias, item bias, which the present paper addresses.

Typically, these researchers have adopted a single approach and used
that approach exclusively in their work. As a result, studies applying more
than one approach to a single set of data have been sparse. This situation
has led to the problem identified by Merz (1977) and addressed by this study:
the psychometric properties of the approaches have not been fully evaluated
using hypothetical and actual item response data.

Purpose 

The  purpose  of  this  study  was  to  investigate  the  following  four  approaches 
to  biased  item  identification  using  common  sets  of  actual  item  response  data: 

1. Transformed item difficulties in which within group p-values are
standardized and compared between groups (Angoff, 1972);

2. Chi-square in which individual items are investigated in terms of
between group score level differences in expected and observed
proportions of correct responses (Scheuneman, 1975);

3. Item characteristic curve theory in which differences in the
probabilities of a correct response given examinees of the same
underlying ability and in different culture groups are evaluated
(Rudner, 1977a);

4. Factor score in which item bias is investigated in terms of loadings
on biased test factors (Merz, 1973).

The investigation addresses the following questions:

1.  Do  the  select  approaches  provide  identical  classifications  of  items 
as  to  their  degree  of  aberrance  when  applied  to  item  response  data 
corresponding  to  two  culturally  different  populations? 

This  question  calls  for  a  comparison  of  the  approaches  as  they  would  typically 
be  applied  in  test  development  or  test  evaluation  studies. 

2. Do the select approaches provide classifications of minimal bias
when applied to subsamples of a single population?

This question is similar to one asked by Jensen (1973) and serves to
evaluate the adequacy of the various approaches. Here, an approach
identifying an abundance of items as biased would be suspect as being
inadequate.

The  Models 

Transformed  Item  Difficulties 

This approach, which examines the interaction of item and groups,
appears to be one of the best known. It has been advocated and used frequently
by Angoff (1972; Angoff and Ford, 1973; Angoff and Modu, 1973) and others
(Green and Draper, 1972; Jensen, 1973; Hicks, Donlon, and Hallmark, 1976;
Strassberg-Rosenberg and Donlon, 1975; Echternacht, 1975; Rudner, 1977c).

In this method, p-values for a group of items are obtained for two
different groups of examinees. Each p-value is converted to a normal deviate
and the pairs of normal deviates, one pair for each item, are plotted on a
bivariate graph, each pair represented by a point on the graph.
The plot will generally be in the form of an ellipse. A 45 degree line,
passing through the origin, provides an indication of the absence of bias.
Items greatly deviating from this line may be regarded as exhibiting an item
by group interaction. That is, relative to the other items, deviant items
are especially more difficult for members of one group than the other.
Assuming both groups received similar instructions, such items would appear to
represent different psychological meanings for the two groups of examinees.

Since the intent is to make comparisons of between-group differences in
item difficulty, it is necessary to transform the proportion passing an item
to an index of item difficulty which constitutes at least an interval scale.
This is accomplished by expressing each item p-value in terms of within-group
deviations of a normal curve (see Guilford, 1954, pp. 418-419).

The distance of an item point to the line can be treated as a measure of
the degree of item bias. One can determine which items are "greatly deviating"
from the line by incorporating outlier or residual analysis. One method is
to place confidence limits on the line by using a multiple of the standard
error of estimation. An alternate approach, adopted by Strassberg-Rosenberg
and Donlon (1975) and Hicks, et al. (1976), involves computing the standard
deviation of the residuals and classifying as biased those items deviating
by greater than 1.5 standard deviation units. Rudner (1977c) has employed
a fixed item-regression line distance of .75 z-score units.
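
The transformed item difficulty procedure might be sketched as follows. This is an illustration under stated assumptions rather than any of the cited authors' implementations: the function names are hypothetical, the reference line is taken to be the major axis of the scatter of item points, the sign convention (harder items receive larger deviates) is assumed, and items are flagged with the 1.5 standard deviation rule described above.

```python
import math
from statistics import NormalDist, mean, pstdev


def to_normal_deviate(p):
    """Convert a proportion passing an item into a normal deviate (z).

    Assumed sign convention: harder items (smaller p) get larger z.
    p must lie strictly between 0 and 1.
    """
    return NormalDist().inv_cdf(1.0 - p)


def major_axis_distances(p_group1, p_group2):
    """Signed perpendicular distance of each item point from the major axis.

    Each item is the point (z in group 1, z in group 2). The major axis
    passes through the centroid of the points.
    """
    x = [to_normal_deviate(p) for p in p_group1]
    y = [to_normal_deviate(p) for p in p_group2]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("item difficulties are uncorrelated; no major axis")
    # Slope of the principal (major) axis of the bivariate scatter.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    norm = math.sqrt(slope ** 2 + 1)
    return [(slope * a - b + intercept) / norm for a, b in zip(x, y)]


def flag_items(p_group1, p_group2, sd_multiple=1.5):
    """Indices of items lying more than sd_multiple SDs from the axis."""
    distances = major_axis_distances(p_group1, p_group2)
    cutoff = sd_multiple * pstdev(distances)
    return [i for i, d in enumerate(distances) if abs(d) > cutoff]


# Hypothetical p-values: item 3 is far harder for group 2 than its
# difficulty in group 1 would suggest.
group1 = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20]
group2 = [0.88, 0.78, 0.68, 0.20, 0.48, 0.38, 0.28, 0.18]
print(flag_items(group1, group2))
```

A fixed-distance rule such as the .75 z-score criterion would replace the `cutoff` line with a constant; which rule is appropriate is a judgment the sketch does not settle.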


An example of the approach is shown in Figure 1. The transformed p-values
have a correlation of approximately .90, making the plot relatively long and
flat. The solid line represents the main axis and the dotted lines represent
linear confidence limits. The item represented in the upper left, outside the
confidence interval, would be considered biased.


Chi-Square

This approach to biased item analysis determines whether examinees of
the same ability level have the same probability of a correct response
regardless of cultural affiliation. This is accomplished by dividing the
tryout samples into groups based on their observed score and comparing the
proportions of students within each level respon[...]

[...] independent observations (Scheuneman, 1975, 1976; Green and Draper,
1972). An item is considered unbiased if, for all individuals in the same
total score interval, the proportion of correct responses is the same for
both groups under consideration. A modified chi-square test determines the
probability that an item is unbiased by this definition.

Scheuneman (1976), in applying the approach to several sets of data,
advocates using four or five total score levels based on the score
distribution of the smaller sample (Green and Draper had used within-group
quintiles).
Item  Characteristic  Curve  Theory 

Latent trait or item characteristic curve (icc) theory relates the
probability of a correct item response to a function of an examinee's underlying
ability level ($\theta_i$) and characteristic(s) of the item. While the various models
(Lord, 1952; Rasch, 1960; Birnbaum, 1968; Urry, 1970) differ in terms of the
number of item parameters considered, they all describe the item parameter(s)
independently of the examined sample. Full development of these and other
mental measurement models can be found in Hambleton and Cook (1977).

This modern measurement theory has been used to identify biased items
(Green and Draper, 1972; Pine, 1976; Lord, 1977; Rudner, 1977a). In an early
study, Green and Draper (1972) had used observed total scores as estimates
of examinees' abilities, the $\theta_i$'s, and the proportions of examinees responding
correctly at each total score level as estimates of $P(u_g = 1 \mid \theta_i)$. Their
procedure called for plotting estimated icc's for each item separately for
each culture group and comparing the plots.


By this and other latent trait theory approaches, an item is unbiased if
examinees of the same ability level, but of different cultural affiliations,
have equal probabilities of responding correctly. That is, an item is unbiased
if the estimated icc's obtained from the various culture groups are identical.

As an example of a biased item, consider the two hypothetical curves shown in
Figure 2. These curves are based on responses by two different culture groups
to the same item. Total observed scores are used as estimates of $\theta_i$ and
proportions of examinees responding correctly are used as estimates of
$P(u_g = 1 \mid \theta_i)$.
The curves are not identical, since the location parameters for the two curves
are not equal. Such an item can be considered biased in that often examinees
of the same ability level, e.g., $X_j = 58\%$, but from different culture groups,
do not have similar proportions of correct responses. While this approach
is appealing, total observed scores are directly incorporated and quantification
of the degree of item bias is difficult (an eyeballing procedure is used to
identify a "very biased item").

Rather than using total observed scores as estimates of $\theta_i$ and proportions
as estimates for $P(u_g = 1 \mid \theta_i)$, more accurate values can be obtained using one of
the recent methods of parameterization (Urry, 1975; Wingersky and Lord, 1973).
During parameterization, the metric used for the $\theta$ scale is defined by the
ability variance in the examined sample. In order to compare parameters
obtained from two different examinee groups, the obtained values must be equated.
Lord and Novick (1968, Chapter 16.11) have shown that this can be accomplished
by computing the regressions of the parameter values based on one group of
examinees on the parameter values based on the other group of examinees.

Rudner (1977a) has refined the procedure used by Green and Draper to
identify biased items by incorporating equated icc parameter values. The area
between pairs of equated icc's is used to indicate the relative amount of
aberrance for each item, and eyeballing of the equated icc's is employed to
provide additional information as to the nature of the aberrance.

Factor  Score 

In factor analysis, underlying factors (i.e., dimensions or traits) are
hypothesized and the correlations of each variable with the hypothesized factors
are computed. In an achievement test, each item is treated as a variable.
Such an analysis could be conducted twice using examinees from two different
cultural backgrounds. Ideally, the two separate groups of examinees would
yield similar sets of item-trait correlations (factor loadings). Different sets
of factor loadings would indicate that the two groups are not responding to the
items in the same manner. Such a test would be considered biased in that it
appears to measure a different trait across groups. The items exhibiting the
most bias would then be those with the largest differences in factor loading.

Merz (1973, 1976a) has suggested an approach which incorporates
factor scores and analysis of variance. In this approach, the item responses
for the groups are combined, factor analyzed, and factor scores for each
examinee on each factor computed. These factor scores are then subjected to an
analysis of variance, with group membership being the independent variable.
Where significant mean differences are found in factor scores, the factor is
classified as biased. Biased items are defined as those with high factor
loadings on a biased factor.

METHOD 

Item  Sample 

The 1973 Stanford Achievement Test, Form A, Primary 2 Battery,
Reading Comprehension Subtest (SAT), which, item for item, is equivalent
to the Stanford Achievement Test - Hearing Impaired Version, Level 2, Reading
Comprehension Subtest, formed the item pool for use in this study.

The SAT consists of 16 paragraphs with a total of 48 four-choice items.
According to the test publishers, the Psychological Corporation, reading
vocabulary is geared to the primary grade levels and emphasis is placed on
comprehending disconnected discourse. It was anticipated that the SAT would contain
several items biased in favor of one of the incorporated culture group samples.

Examinee Samples

Item responses made by large samples from two diverse culture groups
were used in the study. The first culture group was composed of 2,637 students
in programs for the hearing impaired across the United States. The scores on
the SAT for this group were approximately normally distributed with a mean
of 21.6 and a standard deviation of 7.42. This culture group was divided
into two subgroups by randomly assigning the examinees to one of two
independent groups with significantly different (p<.01) mean total scores. Both
subgroups were approximately normally distributed. The first subgroup
contained 1,079 examinees with a mean of 23.7 and standard deviation of 7.43.
The second subgroup contained 1,030 examinees with a mean of 20.9 and a
standard deviation of 6.97. Since the examinees were from the same culture
group, the expected degree of aberrance for each item was zero. That is,
the approaches were expected to be insensitive to the differential performance
of the examinee groups and consistently identify item aberrance as minimal.

The second culture group, representative of the population for which
the SAT was designed, was composed of 1,607 examinees from a large west coast
public school system. The scores on the SAT for this hearing group were
bimodally distributed with modes at 15 and 44, a mean of 28.9, and a
standard deviation of 12.44.

One  major  difference  between  these  two  culture  groups  is  their  exposure 
to,  and  their  ability  to  use,  the  English  language  (see  Stoke,  1976  for  an 
excellent  discussion  on  the  social  and  cultural  characteristics  of  the 
hearing  impaired) .  Thus,  aside  from  cultural  differences,  the  two  groups  of 
examinees  greatly  differed  in  their  mean  level  of  ability  as  measured  by  total 
score  on  the  SAT. 

Procedures 

The  degree  of  bias  for  each  item  within  the  SAT  was  identified  by  applying 
a  select  approach  within  the  transformed  item  difficulties,  icc  theory,  factor 
score  and  chi-square  categories  to  item  responses  made  by  (1)  the  two  diverse 
culture group samples, and (2) two equal culture group samples.

Each  item  bias  detection  approach  was  applied  to  item  responses  made  by 
these  culture  group  pairs  in  the  following  manner: 

transformed item difficulties: Two sets of item p-values were computed
for each culture group pair and transformed to within group normal deviates.
From the bivariate scatterplot of the sets of transformed p-values, the
absolute values of the magnitudes of the item residuals, i.e., the item-45 degree
line distances, were computed. This residual magnitude served to indicate the
relative amounts of item bias.
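
A minimal sketch of this computation is given below. It rests on several assumptions not spelled out in the text: each p-value is converted to a normal deviate through the inverse normal distribution function applied to 1 - p, the deviates are then standardized within each group, and the item residual is taken as the perpendicular distance from the point to the 45 degree line. The function names and the p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

def normal_deviates(p_values):
    """Inverse-normal transform of item p-values, standardized within the group.

    Each p must lie strictly between 0 and 1.
    """
    z = [NormalDist().inv_cdf(1.0 - p) for p in p_values]  # harder item -> larger z
    m, s = mean(z), pstdev(z)
    return [(v - m) / s for v in z]

def distances_to_45_degree_line(p_group1, p_group2):
    """Absolute perpendicular distance of each item from the line y = x."""
    x = normal_deviates(p_group1)
    y = normal_deviates(p_group2)
    return [abs(a - b) / sqrt(2.0) for a, b in zip(x, y)]

# Hypothetical p-values for five items in two groups; the third item is
# much harder for the second group than its neighbors would suggest.
p1 = [0.9, 0.8, 0.7, 0.6, 0.5]
p2 = [0.9, 0.8, 0.3, 0.6, 0.5]
for item, d in enumerate(distances_to_45_degree_line(p1, p2), start=1):
    print(item, round(d, 2))
```

Because the deviates are standardized within each group, an overall difference in group ability does not by itself move items away from the line; only items whose relative difficulty differs between groups do.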

icc theory: Two sets of item icc parameters as defined by Birnbaum's
three parameter logistic model were estimated for each of the SAT items by
separately applying the Urry (1975) iterative minimum chi-square procedure
to the item responses of each of the two culture groups. The parameter
value estimates were then equated by computing the between group linear
regressions for the difficulty and discrimination parameters. The areas
between estimated equated icc's, as approximated by:

$$A_g = \sum_{\theta_i = -5.000}^{5.000} \left| P(u_g = 1 \mid \theta_i) - P'(u_g = 1 \mid \theta_i) \right| \Delta\theta_i$$

where $P(u_g = 1 \mid \theta_i)$ and $P'(u_g = 1 \mid \theta_i)$ define the estimated
equated icc's

and $\Delta\theta_i = .005$

served to indicate the extent of item aberrance.
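
The area approximation can be sketched as below, assuming the usual form of the three parameter logistic model with discrimination a, difficulty b, and lower asymptote c. The scaling constant D = 1.7 and the parameter values are assumptions for illustration; they are not estimates from this study.

```python
from math import exp

def icc_3pl(theta, a, b, c, D=1.7):
    """Three parameter logistic icc; D = 1.7 is an assumed scaling constant."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))

def area_between_iccs(params1, params2, lo=-5.0, hi=5.0, step=0.005):
    """Sum of |P - P'| * step over theta from lo to hi, as in the text."""
    n = round((hi - lo) / step)
    total = 0.0
    for k in range(n + 1):
        theta = lo + k * step  # integer stepping avoids accumulated float drift
        total += abs(icc_3pl(theta, *params1) - icc_3pl(theta, *params2)) * step
    return total

# Hypothetical equated (a, b, c) values for one item in two groups.
group1 = (1.0, 0.0, 0.0)
group2 = (1.0, 1.0, 0.0)
print(round(area_between_iccs(group1, group2), 3))
```

With equal discrimination and no guessing, the area is close to the difference in the difficulty parameters, which gives a rough sense of scale for the index.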

factor score: The item responses on the SAT made by the two culture
groups within each pair were combined and inter-item product-moment correlations
computed. The resultant matrix was then reduced using principal component
factor analysis with an eigenvalue criterion of 1.0. The factor matrix was
rotated orthogonally (varimax) to simple structure and factor scores for each
examinee on each factor computed. Separate t-tests were computed using each
set of factor scores as dependent variables and group membership as the
independent variable. Factors for which there were significant (p<.001) differences
between mean culture group factor scores were classified as biased. The
magnitude of the factor loading ($\lambda_{gj}$) on such factors served as an indicator of
the magnitude of item bias. The index $a_g$ was then defined as the maximum item factor
loading on factors classified as biased. That is,

$$a_g = \max_j (\lambda_{gj}), \quad j = 1, 2, 3, \ldots, \text{number of biased factors}$$

chi-square: Each item was tested individually for bias using a modified
chi-square technique with i = 2 culture groups and j = 5 total score intervals.
By this approach, the expected values for each cell ($E_{ij}$) were obtained by
multiplying (1) the proportion of all examinees with total scores within
interval j responding correctly to the item by (2) the number of examinees within
the cell. That is,

$$E_{ij} = \frac{O_{.j}}{N_{.j}} N_{ij}, \quad i = 1, 2; \quad j = 1, 2, 3, 4, 5$$

where $O_{.j}$ is the number of examinees in total score interval
j responding correctly,

$N_{.j}$ is the total number of examinees in interval j, and

$N_{ij}$ is the total number of examinees in Group i and
score interval j.

As with a conventional chi-square, observed cell values were simply the number
of examinees within the cell responding correctly to the item. For each item,
the magnitude of aberrance was indicated (1) by the value of the resultant
$\chi^2$ and (2) by one minus the probability associated with the $\chi^2$.
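
The expected values and the resulting statistic can be sketched as follows. The sketch assumes that the statistic is summed over the correct-response cells only, as the description of the observed and expected values suggests. The counts are hypothetical, and the probability associated with the statistic is not computed because the degrees of freedom are not stated here.

```python
def modified_chi_square(correct, totals):
    """Modified chi-square for one item.

    correct[i][j]: examinees in group i, score interval j, answering correctly.
    totals[i][j]:  all examinees in group i, score interval j.
    Every interval must contain at least one correct response overall.
    """
    n_groups = len(correct)
    n_intervals = len(correct[0])
    chi2 = 0.0
    for j in range(n_intervals):
        o_dot_j = sum(correct[i][j] for i in range(n_groups))
        n_dot_j = sum(totals[i][j] for i in range(n_groups))
        p_j = o_dot_j / n_dot_j  # pooled proportion correct in interval j
        for i in range(n_groups):
            expected = p_j * totals[i][j]  # E_ij
            chi2 += (correct[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical counts: two groups, three score intervals.
correct = [[10, 40, 70], [4, 20, 60]]
totals = [[50, 80, 90], [40, 60, 70]]
print(round(modified_chi_square(correct, totals), 2))
```

When the two groups have equal proportions correct within every interval, each observed value equals its expected value and the statistic is zero.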

Statistical  Analysis 

Statistical and graphic analyses were conducted to obtain a global
perspective of the similarities and differences among the methodologies. The
following analyses were employed:

1. The relative amount of similarity between pairs of approaches was
determined by respective Pearson Product-Moment correlations.

2. The identified degrees of bias were compared, item by item, by
examining graphs in which items are represented on the abscissa and degree
of item bias on the ordinate.
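
The first analysis can be sketched as a set of pairwise correlations among the index vectors, one value per item and approach. The approach labels and index values below are hypothetical and are not taken from the tables of this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pairwise_correlations(indices):
    """Correlation for every unordered pair of approaches."""
    names = list(indices)
    return {
        (p, q): pearson_r(indices[p], indices[q])
        for k, p in enumerate(names)
        for q in names[k + 1:]
    }

# Hypothetical aberrance indices for five items under three approaches.
indices = {
    "tid": [0.10, 0.45, 0.80, 0.20, 0.05],
    "icc": [0.15, 0.30, 0.70, 0.25, 0.10],
    "chi_square": [12.0, 40.0, 95.0, 30.0, 8.0],
}
for pair, r in pairwise_correlations(indices).items():
    print(pair, round(r, 2))
```

Items that one approach cannot evaluate would have to be dropped from both vectors of a pair before the correlation is computed.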

Results 

Diverse Culture Group Comparison

The indices of aberrance for each approach to biased item identification
for the diverse culture group comparison are given in Table 1. In the icc
approach, two items, 21 and 44, could not be parameterized because of near
zero item-test correlations, and hence could not be evaluated. Seven factors
with eigenvalues exceeding unity were extracted by the principal components
analysis and rotated orthogonally. Significant differences (p<.001) between
the mean factor scores for the two culture groups were found for six factors.
Table 1 shows the maximum factor loading for each item on one of these six
factors. The values for the Transformed Item Difficulties ranged from .04 to
1.25.


Because of the dissimilar total score distributions, a problem was
encountered in applying the chi-square approach. Initially, five observed
score intervals were defined for each item according to the number of examinees
in the hearing sample that responded correctly to the item. This resulted in
highly disproportionate numbers of hearing impaired examinees in each interval.
Also, defining intervals based on the item response distributions of the
hearing impaired examinees resulted in highly disproportionate numbers of hearing
examinees in each interval. A compromise was achieved by averaging the
proportions of examinees responding correctly to the item at each observed score
level across groups, and using four intervals instead of five.

In addition to using the $\chi^2$ value to indicate the relative amount of
aberrance, one minus the probability associated with the chi-square was used.
Both indices are included in Table 1. The use of the probability value as
an index identified 56 percent of the items in the SAT as substantially aberrant
at (1-p) > (1-.001).

The correlations between the indices of aberrance for each method in
the diverse culture group comparisons are given in Table 2. The chi-square -
icc (.67) and the chi-square - transformed item difficulties (.59) correlations
were significant at p<.01. All correlations involving the chi-square and
transformed item difficulties approaches were significant, indicating some degree
of similarity between each of these approaches and the other models. The
factor score and chi-square (1-p) approaches showed the lowest degree of
similarity with the other approaches. The average correlation of each of these
with the other approaches was .29 and .25, respectively, while the average
correlations with other approaches for the chi-square ($\chi^2$), transformed item
difficulties, and icc approaches were .48, .37, and .36, respectively.

Equal-Culture Group Comparison

The indices of aberrance for the item responses in the equal-culture
group comparisons for each approach are given in Table 3. The transformed
item difficulties correlated highly (r = .98) and all the perpendicular item-main
axis line distances were minimal. The maximum distance was .28. No
items would appear to be identified as biased by this approach.

In the icc approach, again items 21 and 44 did not fit the model and
could not be evaluated. Items 28 and 39 showed the most aberrance with values
of .51 and .74, respectively. Both of these items showed less aberrance in
the diverse culture group comparisons, indicating possible misclassification
by this approach.

Fourteen factors with eigenvalues exceeding unity were extracted by the
principal components analysis and rotated orthogonally. Significant differences
(p<.001) between the mean factor scores for the two equal-culture groups were
found for three factors. The maximum factor loading for items on these three
factors ranged between .06 and .72. This range is about the same as the
range noted in the diverse culture group comparisons.

Using the chi-square approach, five total score intervals were defined
based on the average proportions of examinees responding correctly. The
chi-square values obtained were considerably smaller than the values obtained in
the diverse culture group comparisons, and no items would have been classified
as aberrant at the .05 level.


Figure 3 gives a plot of the aberrance indices for each item for each
approach in the diverse culture group comparison and the equal-culture group
comparison. It is apparent from Figure 3 that for each approach the variance
of aberrance in the equal-culture group comparison is less than in the diverse
culture group comparison. In the equal-culture group comparisons, both the
factor score approach and the chi-square (1-p) approach appear to have an
undesirable amount of variation.

DISCUSSION 

The diverse culture group comparison illustrated the approaches as they
might be applied in actual test development. Large numbers of examinees from
two different populations responded to a pool of items purported to measure
the same ability - reading comprehension. Each approach identified a degree
of item aberrance for each item. The results show that there was some agreement
in terms of the identified degrees of aberrance between (1) the transformed
item difficulties and chi-square (magnitude) approaches and (2) the icc theory
and chi-square (magnitude) approaches, although the agreement was not
overwhelming (r = .59 and r = .67, respectively). One minus the probabilities
associated with the $\chi^2$'s and the factor score approach showed little agreement
with any of the other methodologies.

Whether the identified degrees of aberrance are in agreement has little
direct meaning in test development. A more pertinent question is: Do the
approaches lead to the same decisions with regard to which items to classify
as "very biased"? If the answer were in the affirmative, the most appealing
approach would be the simplest one. Table 4 illustrates which items would
be classified as "very biased" by the icc theory, transformed item difficulties
and chi-square (magnitude) approaches under the following decision rules:

(a) icc theory - area > .50

(b) transformed item difficulties - distance > .60

(c) chi-square (magnitude) - $\chi^2$ > 65.0

These decision rules were determined by identifying, from Figure 3, cut-points
which appear to define outliers. Since the variances of the identified degrees
of aberrance for the factor score and chi-square (probabilistic) approaches
were small, any reasonable cut-point would have resulted in large numbers of
items being classified as "very biased"; thus these approaches are not included
in the table.
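
Applying the three decision rules amounts to a threshold comparison per approach, followed by a tally of which approaches flag each item. The sketch below uses the cut-points listed above; the index values, the dictionary layout, and the approach labels are hypothetical and do not reproduce Table 1 or Table 4.

```python
# Cut-points from decision rules (a), (b), and (c).
CUTOFFS = {"icc_area": 0.50, "tid_distance": 0.60, "chi_square": 65.0}

def very_biased(indices, cutoffs=CUTOFFS):
    """Map each flagged item number to the set of approaches that flag it.

    indices[approach][item] is the aberrance index, or None when the
    approach could not evaluate the item.
    """
    flagged = {}
    for approach, values in indices.items():
        for item, value in values.items():
            if value is not None and value > cutoffs[approach]:
                flagged.setdefault(item, set()).add(approach)
    return flagged

# Hypothetical index values for three items.
indices = {
    "icc_area": {16: 0.81, 21: None, 30: 0.20},
    "tid_distance": {16: 0.95, 21: 0.10, 30: 0.70},
    "chi_square": {16: 120.0, 21: 12.0, 30: 80.0},
}
for item, approaches in sorted(very_biased(indices).items()):
    print(item, len(approaches), sorted(approaches))
```

Counting the approaches per item makes it easy to separate items flagged by all three rules from those flagged by only one or two.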


From Table 4, it is apparent that the approaches, under these
decision rules, would have commonly identified items 16, 17, and 22 as "very
biased." Two approaches would have identified items 4, 15, 18, 26, 27, 30
and 45 as being biased. Items 8, 23, 24, 25, 29, 44 and 47, however, were
identified by only one approach. More conservative or more liberal decision
rules would still have resulted in different sets of items being identified.

Since there is some disagreement among the approaches, the results of
the equal-culture group comparison warrant closer examination. The two
groups of examinees in this comparison were from the same well-defined
population; namely, students with a hearing loss sufficient to warrant a
special educational program. As such, item bias between these two groups
is by definition minimal, and the expected amount of aberrance identified for
each item by each approach is assumed to be zero.

Of the approaches, only the transformed item difficulties approach fully
met this criterion. The identified degrees of aberrance from this approach
were small, and by any reasonable decision rule, no items would have been
classified as biased. Thus, the model behaved as expected. The identified
degrees of item aberrance as indicated by the icc theory approach were also
minimal. However, two items could not be evaluated and two items would have
been identified as having fair amounts of aberrance under a liberal decision
rule.

The icc theory approach unexpectedly identified items 28 and 39 as
containing fair amounts of bias. A closer examination of these items reveals that
their latent trait item difficulty parameters were extreme for the second
group of examinees, namely 2.77 and 3.91 respectively. This can be loosely
interpreted as meaning that, ignoring guessing, an examinee's ability must be
2.77 (3.91) standard deviations above the group mean ability to have a better
than average chance of responding correctly. Since relatively few examinees
were of this ability level, parameterization became tenuous and the slight
aberrance in these items is probably due to abnormally high parameterization
error. Thus, this approach is liable to yield spurious results when item
difficulty is extremely high or low. It should be noted that the number of
items in the SAT is really insufficient for a proper evaluation of the icc
approach. From a Monte Carlo investigation of the Urry parameterization
procedure, Schmidt and Gugel (1975) have recommended that a minimum of 60 items
and 1,000 subjects be used to obtain accurate parameter estimates. Since the
SAT contains only 48 items, the parameter value estimates may have contained
more than the usual amounts of error.
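
A rough sense of how few examinees lie at these ability levels can be obtained from normal tail areas. The sketch below assumes that ability is normally distributed within the group and, as a further assumption, takes the second group to be the subgroup of 1,030 examinees described under Examinee Samples.

```python
from statistics import NormalDist

def proportion_above(b, mean=0.0, sd=1.0):
    """Proportion of a normal ability distribution lying above difficulty b."""
    return 1.0 - NormalDist(mean, sd).cdf(b)

group_size = 1030  # assumed: size of the second equal-culture subgroup
for b in (2.77, 3.91):
    share = proportion_above(b)
    print(b, round(share, 5), round(share * group_size, 2))
```

Under these assumptions only a handful of examinees, at most, fall above either difficulty value, which is consistent with the interpretation that the estimates for items 28 and 39 rest on very little information.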

Items 21 and 44 had extremely low item-test point biserial correlations,
which implied that ability was poorly related to the probability of a correct
response. Such items cannot fit the Birnbaum model and hence cannot be
evaluated for bias with the icc theory approach. Although such items are usually
the first to be eliminated in test development, the fact that these items
cannot be evaluated illustrates a weakness in the approach.

The chi-square approach in the equal-culture group comparison produced
wide fluctuations in the probabilities associated with the $\chi^2$'s used to test
the null hypothesis of no bias. However, at p<.05 ((1-p) > .95), no items were
suspected as being biased. Thus, although 56 percent of the items
were identified as biased in the diverse-culture group comparison, in terms
of the equal-culture group comparison, the chi-square approach appeared to be
sufficient when either probabilities or magnitudes were employed.

The factor score approach identifies aberrant items as those having a
major loading on a factor which yields unequal mean factor scores. In the
equal-culture group comparison, three sets of mean factor scores were identified
as unequal at conservative values (p<.001). The maximum loadings of many items
on these factors were high, several being higher than the maximum loading in
the diverse culture group comparison. The approach, as applied to the data
in this study, produced unsatisfactory results in the equal-culture group
comparison.

The above discussion has pointed out that there were differences between
the approaches in the identified degrees of aberrance in both the
diverse-culture group and equal-culture group comparisons. Of the methodologies,
the transformed item difficulties and icc theory approaches appear most
attractive. In the diverse-culture group comparison several items were
identified as biased, and in the equal-culture group comparison, the identified
degrees of aberrance were minimal. The factor score approach did not identify
much variance in item bias in the diverse-culture group comparison and yielded
major loadings in the equal-culture group comparison. Using a conservative
probability level (p<.001), the chi-square approach identified 56 percent of
the items as biased in the diverse culture group comparison and yielded wide
fluctuations in the amount of aberrance in the equal-culture group comparisons.

These latter two approaches - the chi-square approach and the factor
score approach - both incorporate significance testing of large amounts of
data. The chi-square approach examines the hypothesis that the proportions
of examinees responding correctly are identical across individuals in the
same observed score interval and of different cultural classifications. The
factor score approach incorporates the hypothesis that the group mean factor
scores are identical across the defined culture groups on each factor. With
samples as large as those used in this study, hypothesis testing may not be
appropriate. The sample values are such that they can be considered
population values and small differences are statistically significant.
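
The effect of sample size on significance can be illustrated with a simple two-proportion z statistic. This is only an analogy to the tests used in the study, and the proportions and sample sizes below are hypothetical.

```python
from math import sqrt

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se

# The same .05 difference in proportion correct at two sample sizes.
print(round(two_proportion_z(0.525, 50, 0.475, 50), 2))      # small samples
print(round(two_proportion_z(0.525, 1500, 0.475, 1500), 2))  # large samples
```

The difference between the groups is the same in both cases; only the larger samples push the statistic past the conventional 1.96 criterion.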

In the diverse-culture group comparison, the $\chi^2$ values correlated with
the distances of the transformed item difficulties approach and the areas of
the icc theory approach. However, their magnitudes were extreme. It should
be noted that in the diverse culture group comparison, the total score
distributions of the examinee samples were quite divergent. In the equal-culture
group comparison, the distributions were not as different and the $\chi^2$ values
were substantially less.

The chi-square approach analyzes the item response data in terms of
observed score intervals. The observed value for an interval and culture
group is simply the number of examinees in the interval and culture group
responding correctly to the item. The expected value for a culture group and
interval is the product of the proportion of all examinees in the interval
responding correctly to the item and the number of examinees in the culture group and
in the interval. Thus, the expected value will be influenced by the culture
group with the greater number of examinees in the interval when the observed
score distributions are different. Since the item interval definitions are
often similar, this will result in a near systematic inflation of the $\chi^2$
values.


An example of how total score distributions affect the expected interval
values (and consequently the $\chi^2$ values) is illustrated by the hypothetical
item response data shown in Table 5. Here, the total observed score
distributions are quite different. Group 1 has more than five times as many examinees
in the interval as does Group 2. Further, the total number of examinees at
each total score level within the interval decreases as total score increases
for Group 1 and increases for Group 2. However, the proportions of examinees
responding correctly to the item at each total score level are identical across
groups. That is, the two groups perform identically within the interval and
their total score distributions are dissimilar. If the approach were not
sensitive to total score distributions, the observed and expected values for
each group would be identical. However, the observed and expected values are:

for group 1, $O_1 = 136$ and $E_1 = \frac{136 + 31}{480 + 90} \times 480 = 140.6$, and

for group 2, $O_2 = 31$ and $E_2 = \frac{136 + 31}{480 + 90} \times 90 = 26.4$.

Thus, even though the two groups performed identically at each total
score level, the observed and expected values are unequal and would have
inflated the $\chi^2$ value. Had different distributions been employed, different
expected values and a different $\chi^2$ would have been defined.
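
The arithmetic of this example can be checked directly from the interval totals quoted in the text (136 of 480 examinees correct in Group 1 and 31 of 90 in Group 2). The variable names are illustrative only.

```python
# Interval totals quoted in the text for the hypothetical Table 5 data.
correct = {"group1": 136, "group2": 31}
n = {"group1": 480, "group2": 90}

# Pooled proportion correct in the interval, then expected count per group.
pooled = sum(correct.values()) / sum(n.values())
expected = {g: pooled * n[g] for g in n}

for g in n:
    print(g, correct[g], round(expected[g], 1), round(correct[g] / n[g], 3))
```

The pooled proportion lies much closer to that of the larger group, so the smaller group's expected count departs further, in relative terms, from its observed count.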

The factor score approach entails many decision points which will affect
the results. In this study, phi-correlations of the combined data, principal
component analysis, eigenvalues greater than 1.0, varimax rotation, and
probabilities less than .001 were used, and the results appeared to be
unsatisfactory. In the diverse culture group comparison 26 out of 48 items had a
maximum factor loading of .55 ± .10 on a factor yielding significantly different
mean factor scores, and the identified degrees of aberrance in the equal-culture
group comparison fluctuated widely, with several items being identified as being
more aberrant than the most aberrant item in the diverse-culture group
comparison.

The factor score approach attempts to identify items which most strongly
measure traits in which the groups differ significantly. In large scale
investigations, groups are likely to differ on any measured trait including
the ones intended by the test publisher and those unintentionally built into
the test. Thus, a significant difference in the mean factor scores on the
main test factor may be of little interest. Differences on other factors,
however, would indicate the presence of items which inappropriately influence group
mean scores. In order to identify these items, the underlying factors of
the test must be well-defined and the major factor clearly identified.
Principal component analysis using eigenvalues greater than one and varimax rotation
does not appear to allow for this. Principal component analysis yields factors
which are defined by the data (as opposed to inferred); a unity eigenvalue criterion
does not guarantee that the correct number of factors will be extracted; and
varimax rotation can obfuscate the major factor. A different set of factor
analytic procedures might have yielded more equitable results.

It should be noted that the factor score approach incorporates a definition
of item bias which is substantially different than the other approaches.
The approach seeks to identify items which measure a trait other than that
measured by the remaining items of the test (by factor analyzing the combined
data) and heavily contribute to differential performance (by contributing to
differential mean factor scores). Generically, the other approaches are
concerned with which items measure different traits across groups and operationally
with which items behave differently across groups. This distinction is not
as subtle as it may appear. The other approaches are incapable of
identifying items which measure a trait other than that gauged by the other items
when the groups perform equitably.

The two more attractive approaches, the transformed item difficulties
and the icc theory approaches, also incorporate different operational
definitions of bias. The transformed item difficulties approach identifies
items which, relative to the other items in the test, are more difficult for
members of one group than they are for members of another group of examinees.
The icc theory approach identifies items for which examinees of the same true
ability and from different population groups have unequal probabilities of a
correct response. Thus, the transformed item difficulties approach addresses
aggregate group performance as indicated by item p-values and the icc theory
approach addresses the range of item performance along the ability continuum
as indicated by item characteristic curves.
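
As a rough illustration of the transformed item difficulties idea, the sketch below converts each group's p-values to a normal-deviate scale and measures each item's perpendicular distance from the major axis of the resulting scatter. The delta scale (13 + 4z), the major-axis formula, and all names and p-values are assumptions taken from the usual delta-plot formulation; none of these details is given in this paper.

```python
import math
from statistics import NormalDist, mean


def delta(p):
    """Transform a proportion correct to the delta scale (harder = larger)."""
    return 13.0 + 4.0 * NormalDist().inv_cdf(1.0 - p)


def major_axis_distances(p_group1, p_group2):
    """Signed perpendicular distance of each item from the major axis."""
    x = [delta(p) for p in p_group1]
    y = [delta(p) for p in p_group2]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) * (xi - mx) for xi in x)
    syy = sum((yi - my) * (yi - my) for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Slope and intercept of the major (principal) axis of the scatter.
    slope = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy * sxy)) / (2.0 * sxy)
    intercept = my - slope * mx
    norm = math.sqrt(slope * slope + 1.0)
    return [(slope * xi + intercept - yi) / norm for xi, yi in zip(x, y)]


# Made-up p-values: the third item is relatively harder for group 2.
p1 = [0.80, 0.65, 0.70, 0.50, 0.35]
p2 = [0.70, 0.55, 0.30, 0.40, 0.25]
distances = major_axis_distances(p1, p2)
most_aberrant = max(range(len(distances)), key=lambda i: abs(distances[i]))
print("most aberrant item:", most_aberrant + 1)
```

Because every item is judged relative to the others, a uniform difference in difficulty between the groups shifts the axis but produces no large distances; only items that depart from the common trend stand out.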

The difference between these two approaches is illustrated by items 25
and 17 (in Figure 4). In the diverse-culture group comparison, item 25 was
identified as biased by the icc theory approach and not by the transformed item
difficulties approach. The overall difficulty of the item for the two
diverse-culture groups was about equal. Consequently, the item was not identified by
the transformed item difficulties approach. However, low ability hearing
impaired examinees and high ability hearing examinees are favored. That is,
when considered across ability levels the item behaved differently between
groups. Item 17, which was identified by both approaches, does not show this
type of inverted differential performance. Across the ability continuum,
hearing examinees are favored.
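
The contrast between items 25 and 17 can be mimicked with two hypothetical pairs of curves. The sketch assumes a three-parameter logistic form for the icc and a simple midpoint-rule area between the two groups' curves; the parameter values are invented for illustration and are not the estimates plotted in Figure 4.

```python
import math


def icc(theta, a, b, c, D=1.7):
    """Three-parameter logistic item characteristic curve (assumed form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def areas_between(params_g1, params_g2, lo=-3.0, hi=3.0, steps=600):
    """Return (signed, unsigned) area between two groups' iccs on [lo, hi]."""
    width = (hi - lo) / steps
    signed = unsigned = 0.0
    for k in range(steps):
        theta = lo + (k + 0.5) * width  # midpoint rule
        diff = icc(theta, *params_g1) - icc(theta, *params_g2)
        signed += diff * width
        unsigned += abs(diff) * width
    return signed, unsigned


# Crossing curves (the item 25 pattern): same difficulty, different slopes.
crossing = areas_between((0.5, 0.0, 0.2), (1.5, 0.0, 0.2))
# Non-crossing curves (the item 17 pattern): one group favored throughout.
uniform = areas_between((1.0, -0.5, 0.2), (1.0, 0.5, 0.2))

print("crossing: signed %.3f, unsigned %.3f" % crossing)
print("uniform:  signed %.3f, unsigned %.3f" % uniform)
```

A signed area near zero together with a sizable unsigned area corresponds to the crossing case, in which overall difficulty is about equal across groups, so a method based on aggregate p-values would not be expected to flag the item.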


When comparing the transformed item difficulties and icc theory approaches
in terms of different decision rules, five items were commonly identified by
both approaches. All five of these items were of this latter type: noninverted
differential performance across the ability continuum. This further
illustrates that the transformed item difficulties approach is sensitive to
differences in mean item difficulty while the icc theory approach appears to be
sensitive to both mean item difficulty and to group performance along the
continuum. However, it should be noted that different definitions of item
difficulty, and hence mean group performance, are employed. The transformed
item difficulties approach directly defines item difficulty from the aggregate
data. The icc theory approach infers item difficulty from performance on the
item alone. Since these different definitions are employed, different items
were identified as being biased against a group as a whole.

Conclusions 

Based on the two applications, the factor score and chi-square approaches
appeared to be inadequate for identifying biased items. The $\chi^2$ values in the
chi-square approach were shown to become inflated as total observed score
distributions differ, thus leading to erroneous classifications of bias. The
factor score approach, which incorporates a somewhat different definition of
bias, identified large degrees of aberrance in the equal-culture group
comparison. It was felt that the decisions used in factor analyzing the data led
to the unsatisfactory results. It was further noted that both of these
approaches employed inference testing which may not be appropriate with the
large sample sizes used in this study.
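
The concern about inference testing with large samples can be seen in a small sketch: for a fixed difference in proportions correct, the Pearson chi-square statistic grows in direct proportion to sample size. The 2 x 2 counts below are invented and the function name is an assumption.

```python
def pearson_chi_square(table):
    """Pearson chi-square for a 2 x 2 table [[a, b], [c, d]] of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows are groups, columns are (correct, incorrect): 55% versus 50% correct.
small = [[55, 45], [50, 50]]
large = [[count * 100 for count in row] for row in small]

chi2_small = pearson_chi_square(small)
chi2_large = pearson_chi_square(large)
print(round(chi2_small, 2), round(chi2_large, 2))
```

With one degree of freedom the .05 critical value is about 3.84, so the same five-point difference is far from significant with 200 examinees and highly significant with 20,000.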

The transformed item difficulties and the icc theory approaches appeared
to be most promising. The identified degrees of aberrance in the
equal-culture group were consistently low for both approaches, although a liberal
decision rule would have led to the false identification of one or two items
by the icc theory approach. The two approaches identified several items in
common in the diverse-culture group comparison. The major difference between
these two methodologies is that the icc theory approach appears to be
sensitive to bias against both individuals and groups of examinees and the
transformed item difficulties approach appears to be sensitive to bias only against
groups.

Recommendations

The investigation utilized a single set of diverse-culture group data
for which the item parameters were unknown a priori. While there was substantial
reason to suspect the presence of some biased items, the true number of biased
items, their amounts of aberrance and their item numbers were unknown. A
similar study using simulated data with known parameters may prove revealing.
Such a study could also investigate the behavior of the approaches under
different numbers of biased items.

One of the more promising and interesting approaches to the detection of
biased items, the distractor response analysis (Veale and Foreman, 1975, 1976;
Maw, 1977), was not evaluated in this study due to the lack of the
appropriate item response data. Rather than analyzing the numbers of examinees
responding correctly, this approach identifies differences in distractor
response patterns. Although the approach incorporates inference testing, it
may prove beneficial to the field and should be considered in future
investigations of item bias detection methodologies.

TABLE 1

Degrees of Aberrance Identified by the Approaches
in the Diverse-Culture Group Comparison

Item   ICC     Transformed     Chi-      Chi-      Factor
 #     Area    Item            Square    Square    Score
               Difficulties    (1-p)     (χ²)

  1     .40        .24          .98        5.9      .35
  2     .07        .31          .999      33.1      .53
  3     .29        .13          .87        8.5      .55
  4     .75        .79          .999      54.2      .45
  5     .25        .21          .99       11.1      .61
  6     .17        .18          .89        6.2      .40
  7     .15        .43          .99       11.9      .45
  8     .50        .54          .999      27.9      .46
  9     .27        .14          .99       12.6      .35
 10     .24        .46          .99       11.1      .42
 11     .34        .54          .999      42.8      .62
 12     .37        .52          .999      43.6      .60
 13     .11        .52          .999      55.1      .52
 14     .16        .05          .60        3.0      .28
 15     .25        .68          .999     105.4      .42
 16     .57       1.11          .999     107.7      .61
 17     .76       1.25          .999     155.0      .65
 18     .83        .85          .999      27.7      .26
 19     .37        .23          .999      30.7      .30
 20     .16        .18          .99       14.8      .36
 21     --         .44          .99       14.4      .56
 22    2.30        .67          .999     240.9      .52
 23     .38        .67          .999      31.8      .23
 24     .61        .51          .98       10.2      .53
 25    1.01        .08          .999      49.5      .57
 26     .38        .67          .999      94.8      .60
 27     .04        .76          .999      65.2      .48
 28     .32        .18          .96        8.2      .55
 29     .29        .44          .999      65.4      .34
 30     .23       1.05          .999     122.3      .52
 31     .13        .07          .999      26.0      .36
 32     .19        .01          .65        4.2      .27
 33     .14        .15          .99       13.7      .44
 34     .15        .05          .96        8.2      .17
 35     .14        .66          .999      17.7      .33
 36     .09        .17          .999      33.6      .22
 37     .07        .32          .18         .9      .26
 38     .14        .43          .999      34.7      .20
 39     .23        .14          .99       15.1      .36
 40     .08        .37          .999      23.4      .44
 41     .27        .16          .60        2.8      .51
 42     .27        .33          .60        2.9      .46
 43     .07        .16          .999      22.8      .46
 44     --         .26          .999     133.2      .48
 45     .55        .04          .999      85.1      .49
 46     .25        .16          .99       13.4      .51
 47     .60        .21          .88        6.1      .57
 48     .34        .24          .999      33.1      .44

Table 2

Correlations of the Degrees of Aberrance Identified
by the Approaches in the Diverse-Culture Group Comparison

                      Transformed     Chi-Square   Chi-Square   Factor
                      item            (χ²)         (1-p)        score
                      difficulties

Icc theory               .31*           .67**        .17          .28
Transformed item
  difficulties                          .59**        .29*         .30*
Chi-square (χ²)                                      .31*         .34*
Chi-square (1-p)                                                  .23

 * p < .05
** p < .01

TABLE 3

Degrees of Aberrance Identified by the Approaches
in the Equal-Culture Group Comparison

Item   ICC     Transformed     Chi-      Chi-      Factor
 #     Area    Item            Square    Square    Score
               Difficulties    (1-p)     (χ²)

  1     .12        .02          .32       2.4       .19
  2     .15        .02          .22       1.7       .07
  3     .10        .16          .05        .5       .16
  4     .06        .06          .32       2.4       .36
  5     .08        .18          .01        .1       .07
  6     .28        .14          .48       3.3       .06
  7     .24        .09          .08        .9       .26
  8     .19        .03          .01        .2       .02
  9     .19        .08          .52       3.4       .32
 10     .08        .02          .28       2.1       .09
 11     .18        .00          .03        .5       .19
 12     .17        .11          .28       2.1       .14
 13     .04        .13          .01        .2       .19
 14     .21        .12          .12       1.2       .20
 15     .04        .07          .18       1.6       .26
 16     .22        .03          .40       2.6       .13
 17     .31        .15          .48       3.3       .20
 18     .26        .07          .08        .9       .57
 19     .32        .03          .68       4.8       .20
 20     .24        .04          .15       1.4       .46
 21     --         .28          .68       4.7       .11
 22     .17        .05          .40       2.6       .06
 23     .34        .14          .06        .7       .15
 24     .19        .21          .09       1.0       .20
 25     .36        .09          .68       4.8       .08
 26     .21        .01          .03        .6       .17
 27     .11        .02          .07        .3       .40
 28     .51        .16          .59       3.8       .14
 29     .11        .09          .26       2.0       .40
 30     .14        .14          .53       3.7       .19
 31     .09        .10          .12       1.1       .14
 32     .07        .03          .31       2.3       .24
 33     .34        .12          .78       5.6       .25
 34     .14        .13          .20       1.7       .72
 35     .12        .21          .73       5.3       .70
 36     .22        .18          .07        .8       .72
 37     .06        .15          .26       2.1       .63
 38     .23        .09          .48       3.3       .34
 39     .74        .16          .88       7.6       .10
 40     .38        .06          .47       3.2       .20
 41     .35        .14          .81       6.5       .08
 42     .37        .05          .52       3.5       .11
 43     .29        .08          .12       1.2       .11
 44     --         .12          .31       2.3       .08
 45     .34        .06          .48       3.4       .48
 46     .14        .10          .07        .8       .51
 47     .32        .07          .08        .9       .68
 48     .26        .16          .68       4.8       .43

TABLE 4

Items classified as biased (***) by
three approaches under select decision
rules in the diverse-culture group comparison

Hypothetical Item Response Distributions by Total
Score Levels Within a Single Interval

Figure 2: Two hypothetical response distributions

Figure 4: Estimated equated icc's for items 17 and 25
in the diverse-culture group comparison.

A STUDY OF THE UTILITY OF LATENT TRAIT SCORING FOR
MEASURING THE CULTURE-FAIRNESS OF TESTS


By 

Charles  H.  Cory 

Navy  Personnel  Research  and  Development  Center 


Personnel with submarginal literacy have been a continuing problem to the Navy
and over the years have been processed by means of a variety of special programs
designed to identify and either upgrade or eliminate them. At the present time the
Navy is administering a reading comprehension test early in Boot Camp to about 30
percent of incoming recruits in order to identify the small percentage of submarginal
readers in the enlisted input. On the basis of the test results, recruits who are found
to have a reading grade level (RGL) below 3.0 are given administrative discharges;
those having RGLs between 3.0 and 5.4, inclusive, are assigned to Academic Remedial
Training; and those with RGLs of 5.5 or better are assigned directly to Recruit
Training.
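
The screening rule just described can be written as a small function. This is only a sketch of the cut scores stated above; the function name is hypothetical, and it assumes RGLs are reported to one decimal place, so that no score falls between 5.4 and 5.5.

```python
def disposition(rgl):
    """Map a reading grade level (RGL) to the disposition described above."""
    if rgl < 3.0:
        return "administrative discharge"
    if rgl < 5.5:  # RGLs of 3.0 through 5.4, inclusive
        return "Academic Remedial Training"
    return "Recruit Training"


for score in (2.9, 3.0, 5.4, 5.5):
    print(score, "->", disposition(score))
```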

Reading Grade Level is a good predictor of recruit attrition versus non-attrition
for personnel who are reading in the submarginal range. Studies at the Navy Personnel
Research and Development Center (NPRDC) have found that 64 percent of recruits with
RGLs less than 4.0 attrite prior to completion of Recruit Training. This compares
with 20 percent Boot Camp attrition for personnel reading in the 4.0-5.9 RGL range and
10 percent attrition for personnel reading in the 6.0-7.9 RGL range. Recently an
effort was undertaken to determine whether the relationship of reading grade level to
attrition from Recruit Training was substantially the same or whether it was
substantially different for Blacks and Whites.

To examine this question, records for all Whites and all Blacks were extracted
from a large, full-range sample of incoming recruits who had been given the
Gates-MacGinitie Test of Reading Comprehension, Survey D, grades 4-6. Two-by-two tables,
formed separately for the White and Black subsamples, were designed to show the
disposition of personnel with RGLs less than 4.0 and greater than or equal to 4.0 relative
to attrition/nonattrition in Recruit Training.

Table 1 shows these data for Whites. You can see that reading score has a
moderately high correlation with non-attrition in Recruit Training. As would be expected, the
great majority of the group reading above the fourth grade level did not attrite in
Recruit Training and the phi coefficient of .29 indicates that a moderately high
positive relationship existed between reading score and non-attrition in Recruit Training.
From the left hand column of Table 1 it is apparent that more Whites with RGLs below
4.0 attrited in Recruit Training than did not attrite in Recruit Training.

As is shown in Table 2, for Blacks the relationship between reading score and
attrition in Recruit Training was somewhat weaker than for Whites (phi coefficient of .16
compared with .29), but it was still significantly different from zero at p < .001.
Correlations of these magnitudes in the two racial groups appear to meet the
requirements for unbiased tests which have been specified in the proposed Uniform Guidelines
on Employee Selection Procedures of the Equal Employment Opportunity Coordinating
Council.
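
For readers unfamiliar with the statistic, the phi coefficient for a two-by-two table can be computed as sketched below. The cell counts are invented for illustration (Tables 1 and 2 are not reproduced in this text), and the function name is an assumption.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table of counts laid out as
        [[a, b],
         [c, d]].
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Hypothetical counts. Rows: RGL below 4.0, RGL 4.0 or above.
# Columns: attrited, did not attrite.
phi = phi_coefficient(30, 20, 100, 850)
print(round(phi, 2))
```

Phi is the Pearson correlation between the two dichotomous variables, so it ranges from -1 to +1 and is zero when the row proportions are identical.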

However, from the left hand column of Table 2, it can be seen that only about one
Black in four with less than a fourth grade reading level actually attrited from
Recruit Training. In fact, only 26 percent of Blacks compared with 57 percent of
Whites with RGLs below 4.0 attrited in Recruit Training. These differences are
disturbing, and they indicate that subtle biases, particularly at the item level, may
be present in the test. Accordingly, it was decided to investigate whether a more
sophisticated scoring technique using a latent trait model could detect bias in the
test. The study which was undertaken in this connection had the objectives of (1)
detecting item bias in the test if it existed and measuring its magnitude and (2)
evaluating the utility of a latent trait model for reducing the differences in the
screen rates of Blacks and Whites.

Latent trait, or item characteristic curve (ICC), models relate the probability
of success on an item to a function of the trait being measured and to the
characteristics of the item. A frequently used model for ICCs, and one which was used for the
present study, is a 3-parameter logistic model. This model assumes (1) that an
examinee's probability of responding correctly on an item is unaffected by his response
on any other item (i.e., the items have local independence) and (2) that all items in
the test measure a single trait.

Figure 1, an example of an ICC for a multiple choice item with five alternatives,
relates the probability of answering the item correctly (on the ordinate) to the ability
of the examinee, shown on the abscissa. Ability is normally denoted by the Greek
letter $\theta$.

You can see that the probability of a correct response is a monotonic function of
$\theta$. At sufficiently low levels of ability, represented by 1 1/2 standard deviations
below the mean, the probability of answering this item correctly is essentially at a
chance level. In contrast, for very high levels of ability, the probability of getting
the item correct approaches unity.

A formula for the 3-parameter logistic model, which would be used to determine
the probability of a correct response for any given $\theta$ to item i, is shown at the bottom
of Figure 1. The term $P_i(\theta)$ is the probability that an individual with ability $\theta$ will
correctly answer item i; D is a constant scaling factor; and $a_i$, $b_i$, and $c_i$ are
parameters of item i which together describe the relationship between ability level and
probability of a correct response.

$b_i$, the item difficulty parameter, indicates the level of $\theta$ which, excluding the
effects of guessing, has exactly a 50 percent probability of answering item i correctly.
For the curve in Figure 1, $b_i$ would be about 0. $a_i$ is proportional to the slope of
the item characteristic curve measured at the point on the ICC which is directly above
$b_i$. It indicates the rapidity at which the probability of a correct response changes
with changes in ability level. Since it measures the ability of the item to
discriminate among levels, it is called the discrimination parameter. $c_i$ is the lower
asymptote of the ICC for item i and represents the probability that a person having
no knowledge will guess the correct answer to item i. Notice that $c_i$, the empirical
probability of guessing correctly, may or may not be the same as the theoretical
probability of guessing the correct answer. For the 5-choice item shown, the theoretical
probability of guessing correctly is the reciprocal of the number of choices for the item,
or .20. The item parameters $a_i$, $b_i$, and $c_i$ and the ability parameter, $\theta$, are measured
on the same scale and are typically expressed as standard scores.
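
The formula itself appears only in Figure 1, which is not reproduced in this text. The sketch below assumes the standard form of the three-parameter logistic model, $P_i(\theta) = c_i + (1 - c_i) / (1 + e^{-D a_i (\theta - b_i)})$, which matches the parameter descriptions above. The value D = 1.7 is the conventional scaling constant and, like the item parameter values, is an assumption here.

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Probability of a correct response under the assumed 3-parameter
    logistic model: c + (1 - c) / (1 + exp(-D * a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical 5-choice item: b = 0, and c set to the chance level of .20.
a_i, b_i, c_i = 1.0, 0.0, 0.20
for theta in (-3.0, -1.5, 0.0, 1.5, 3.0):
    print(f"theta = {theta:+.1f}  P = {p_correct(theta, a_i, b_i, c_i):.3f}")
```

At $\theta = b_i$ the curve passes through $(1 + c_i)/2$, here .60, which is the 50 percent point once the guessing floor $c_i$ is set aside; at very low ability the probability approaches $c_i$, and at very high ability it approaches 1.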

Extensive research has been conducted with latent trait models for scoring tests.
Tests created using these models have been found to produce substantial savings in
administration time and/or to be associated with greater accuracy of measurement than
similar tests which are administered using traditional formats and methods of scaling.

The ability to develop tests which are free of item bias has also been claimed as
an advantage of latent trait scoring methods. In fact, Petersen (1977) and Lord
(1977), among others, have pointed out that latent trait techniques represent the
only adequate means for evaluating bias at the item level. In the latent trait model
an item would be considered unbiased if the probability of getting it correct is the
same for all examinees of a given ability, regardless of group membership.

Despite the theoretical attractiveness of the latent trait approach, very few
studies of item bias have made use of this technique. Furthermore, the major studies
using latent trait scaling to evaluate item bias have either employed simulated data
to test the characteristics of the model (Pine & Weiss, 1976), or have been focused on
identifying items whose ICCs for Blacks and Whites were statistically identical, except
for sampling variation (Lord, 1977). There appear to be no studies in the literature
which consider the differences in ICCs of racial groups in relation to scores on
external criteria.

The Navy Personnel Research and Development Center has a computer program
developed by Dr. Urry of the U.S. Civil Service Commission which uses a 3-parameter
logistic model to scale test items and to compute theta values. For the present
research it was decided to apply this program to samples of Blacks and Whites
extracted from a full range sample of more than 30,000 incoming recruits who had
been administered the Gates-MacGinitie Reading Test (GM), Survey D, grades 4-6, in
connection with research conducted by Dr. Duffy of NPRDC. Records of personnel in
this data base also contain a code for attrited/non-attrited in Recruit Training,
the AT code.

Complete samples of Blacks and Whites who took Form 2 of the GM were extracted
from the data base and were used to create two separate Black-White data sets. The
Truncated (TR) data set consisted of personnel who had scores of 50 or lower on the
General Classification Test (GCT), a test of verbal ability. In other words, the TR file was
composed of personnel having verbal abilities in a range that, in the Navy, is
routinely screened for reading ability. All Blacks and every seventh White having
GCT scores in this range were included in the TR data set, a total of 680 Blacks and
733 Whites.

The Full Range (FR) data set consisted of personnel having GCT scores across the
entire range. Thus it included the total Black sample and every twelfth case in the
total White sample, a total of 951 Blacks and 993 Whites. For the TR and FR data
sets, statistics on the Gates-MacGinitie tests are shown in the next table.

It can be seen from the table that the mean reading scores for Blacks were
about one half and one fourth standard deviations below those for the Whites in the
FR and TR samples, respectively, and the standard errors of the test were somewhat
larger for Blacks than for Whites. In contrast, the standard deviations and
KR-20 coefficients of Blacks and Whites were substantially similar. These findings
are not very surprising; in fact, they are consistent with the usual findings for
scores of Blacks and Whites on paper-and-pencil tests.

At the present time, comparison of potential racial bias in item characteristic
curves has a kind of stopgap or seat-of-the-pants quality because the distribution
characteristics of the item parameters are not known. Consequently, there is no
specific test which is generally accepted as appropriate to evaluate the statistical
significance of Black-White differences in ICCs in the way that, for example, the
Gulliksen-Wilks test is appropriate for evaluating the statistical significance of
differences in linear regression statistics of Blacks and Whites.

In the Urry program a 2-step process is used to develop item parameters. The
first estimates of the parameters are based on standardized total test scores and the
second are computed using Bayesian modal estimates of ability. The latter values
are basically estimates obtained by applying the first estimated parameters, $\hat{a}$, $\hat{b}$,
and $\hat{c}$, to the scored items for each examinee to compute an estimate of individual
ability, $\hat{\theta}$. In turn, the probability of getting the item correct is plotted against
these $\hat{\theta}$ values, and revised $\hat{\hat{a}}$, $\hat{\hat{b}}$, and $\hat{\hat{c}}$ parameters are computed. During the
process of computing double-hat parameters, cases for which stable estimates of $\hat{\theta}$
cannot be computed are dropped. Also, during this step, items with $\hat{\hat{a}}$, $\hat{\hat{b}}$, and $\hat{\hat{c}}$
values which have poor characteristics are excluded from the calculation of $\hat{\theta}$.
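
Urry's program is not described here in enough detail to reproduce. As a generic illustration of what a Bayesian modal ability estimate is, the sketch below finds the posterior mode by grid search under an assumed standard normal prior and an assumed three-parameter logistic model. The item parameters, grid, and function names are all hypothetical.

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Assumed 3-parameter logistic response probability."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def bayes_modal_theta(responses, items, lo=-4.0, hi=4.0, step=0.01):
    """Grid-search posterior mode of ability under a N(0, 1) prior.

    responses: list of 0/1 item scores; items: list of (a, b, c) tuples.
    """
    best_theta, best_log_post = lo, -math.inf
    n_points = int(round((hi - lo) / step)) + 1
    for k in range(n_points):
        theta = lo + k * step
        log_post = -0.5 * theta * theta  # log prior, up to a constant
        for u, (a, b, c) in zip(responses, items):
            p = p_correct(theta, a, b, c)
            log_post += math.log(p) if u else math.log(1.0 - p)
        if log_post > best_log_post:
            best_theta, best_log_post = theta, log_post
    return best_theta


# Hypothetical first-step item parameters (a, b, c) for a 5-item test.
items = [(1.0, -1.0, 0.2), (1.2, -0.5, 0.2), (0.8, 0.0, 0.2),
         (1.0, 0.5, 0.2), (1.5, 1.0, 0.2)]
theta_low = bayes_modal_theta([0, 0, 0, 0, 0], items)
theta_mid = bayes_modal_theta([1, 1, 1, 0, 0], items)
theta_high = bayes_modal_theta([1, 1, 1, 1, 1], items)
print(theta_low, theta_mid, theta_high)
```

The prior keeps the estimate finite even for all-correct or all-incorrect response patterns, for which a pure maximum likelihood estimate would not exist.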

Thus the $\hat{\theta}$ ability estimates have been purified of internal inconsistencies in
two respects: (1) they are not based on scores from items with ICCs which have
poor characteristics and (2) they do not contain unstable $\hat{\theta}$ values. For these reasons
$\hat{\theta}$ might be expected to be more accurate than RGL as an estimate of ability. In addition,
the $\hat{\theta}$ values in the present study were based on ICCs which were specific to the
individual racial group. If test bias were a cause of the differential in the Black-White
screen rates of RGLs, comparisons based on $\hat{\theta}$ values computed in this fashion would
be expected to decrease the differences. Based on these considerations, the following
sets of comparisons were carried out for both the TR and the FR data sets.

1. Counts were made of the items, categorized as shown in the next slide.

2. For the items in categories 1 and 2 on the slide, the a, b, and c parameters
of Blacks and Whites were formed into bivariate distributions and inspected for
similarities and differences.

3. Two-by-two tables relating scores on $\hat{\theta}$ and AT and RGL and AT were formed and
inspected.

4. AT was regressed on $\hat{\theta}$ and on RGL separately for Blacks and Whites.

5. $\hat{\theta}$ values were correlated with RGL and other cognitive measures as well as with
an additional two performance criteria which were available on the records.

Results 

For both Blacks and Whites, the percentages of items for which stable $\hat{a}$, $\hat{b}$, and
$\hat{c}$ and $\hat{\hat{a}}$, $\hat{\hat{b}}$, and $\hat{\hat{c}}$ parameters could be computed, together with the percentages of
individuals in the samples who had stable $\hat{\theta}$ values, are shown in the next table.

The following relationships are apparent from the table: 1. Stable ICCs could
be computed for more of the items for Blacks than for Whites. 2. A drastic reduction
in percentages of items having stable values occurred for the computation of the
double-hat parameters versus the percentages for the single-hat parameters. 3. In general,
the number of items for which stable ICCs could be computed was somewhat greater for the
TR than for the FR data set, but the differences were not large. 4. The percentages
of the samples for which stable ability estimates could be computed were slightly
larger for Blacks than for Whites.

Thus the statistics look somewhat better for Blacks than for Whites.

For the items for which a, b, and c parameters were computed for both Blacks and
Whites, the values for each of six item statistics for Blacks and Whites were correlated.
These coefficients are shown in the next table.

It is apparent that the extent of agreement among Blacks and Whites in terms of the
item statistics was greater for the TR than for the FR data set. Also, for both data
sets, the extent of agreement among the coefficients was lower for the double-hat
parameter items than for the single-hat parameter items. It is interesting that the
agreement is particularly high for the two item difficulty statistics, the p value of
classical test theory and the b parameter of latent trait theory. The extent of
agreement for the slope parameters also tends to favor the statistic of classical test
theory, the point biserial coefficient, although there is considerable variation in
these relationships across the sets of items and in one set, the coefficient for the
a parameter is higher than that for the point biserial.

The  agreement  was  greater  for  the  point  biserial  coefficient  than  for  the 
biserial  coefficient  for  the  single-hat  items  and  this  relationship  was  reversed 
for  double-hat  items.  The  values  for  the  c  parameter  had  substantial  correlations  for 
the  single-hat  parameter  items  but  these  correlations  were  completely  eliminated 
for  the  double-hat  items. 

Scattergrams for these statistics for the TR Data Set are shown in the next
four figures. For each scattergram, the regression line of Whites on Blacks is
drawn as a solid line and a dashed line indicating perfect correlation is shown.
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f  Figure*  2  and  6  illustrate  the  high  degree  of  agreement  between  the  item  diffi- 

1  culties  for  Blacks  and  Whites  which  was  characteristic  of  the  statistics  for  both 
1  classical  and  latent  trait  theory.  As  is  shown  by  the  intercept  of  Figure  2,  on 
i  average,  p  values  were  about  .11  higher  for  Whites.  In  contrast,  for  the  b  statistic, 
I  the  differences  between  the  racial  groups  were  almost  nonexistent. 

Figures 3 and 5 provide graphic illustration of the greater correspondence
between the Black-White distributions for the point biserial coefficient than for
the a parameter, as was discussed previously. The location of the intercepts
indicates that the slopes of the item-test regressions were generally greater for
Whites for both the classical and the latent trait measures.

The next table provides comparisons for Blacks and Whites of the correlations of
θ̂ and RGL with three criteria, two classification tests and a biographical variable,
years of education. It can be seen that the correlations in general are considerably
higher for Whites than for Blacks, a finding which is depressingly consistent with
those of most studies. In addition, the hoped-for increase in the accuracy of
prediction of Recruit Attrition from the use of θ̂ estimates of ability not only did
not materialize, but θ̂ was actually not as good as RGL for predicting most of the
variables. As a predictor of attrition of Blacks during Recruit Training, the most
important comparison, θ̂ did not even have statistically significant validity
coefficients, in contrast to the comparable coefficients for RGL, which were
significant at the .01 level. The same types of relationships were generally
characteristic for the other variables in the table. In general, θ̂ was not as good a
predictor as RGL for criteria and did not correlate as highly as RGL with other test
variables and with years of education.

To  provide  a  comparison  with  the  Pearson  product  moment  correlation  coefficients 
computed  for  RGL  and  Recruit  Attrition  shown  in  the  previous  table,  phi  coefficients 
were  computed  for  Blacks  and  Whites  grouped  into  two-by-two  tables  using  the  RGL  and 
Recruit  Attrition  categories  that  were  shown  in  Tables  1  and  2.  The  results  of  these 
analyses  are  shown  in  the  next  table.  For  both  Blacks  and  Whites  the  coefficients 
are  generally  about  two  or  three  points  higher  than  the  Pearson  rs  shown  in  the 
previous  table.  Thus  if  it  were  desired  to  provide  a  pre-enlistment  screen  for  low 
readers, the best cut point would be at an RGL of 4.0.
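
As a rough illustration of the phi coefficient used for the two-by-two tables, the sketch below computes phi from four cell counts. The counts are hypothetical and are not taken from the paper's tables; the function name is likewise an assumption made for illustration.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as [[a, b], [c, d]].

    For example, rows could be RGL below / at-or-above a cut point and
    columns could be attrited / completed (a hypothetical layout).
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


# Hypothetical counts, for illustration only.
example_phi = phi_coefficient(30, 70, 20, 180)
```

Phi is numerically the Pearson r obtained when both variables are coded 0/1, so it differs from the coefficients in the previous table only because the continuous variables have been dichotomized at the chosen cut points.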

However, the poor performance of θ̂ as a predictor of Recruit Attrition was
disturbing. Therefore an additional set of analyses was performed to throw some
light on the reasons for this phenomenon.

As is shown in the next table, the two parts of the CM, Vocabulary and Reading
Comprehension, were moderately speeded and not everyone finished all of the items.
The speededness was somewhat greater for the Reading Comprehension section than for
the Vocabulary section, and for the TR than for the FR data sets. For both the FR
and the TR data sets, Blacks were considerably slower than Whites. In general, for
any number of items omitted, the percentage of the Black group at that level was from
30 to 200 percent greater than the percentage of the White group at that level.

The next table presents the same number-of-items-omitted comparison; however,
the sample for which it was made consists of personnel remaining after computation of
the double-hat parameters. You can see that for persons in the refined sample, the number
of omitted items has been reduced so that the percentage omitting items at any level
is only about half to two thirds of the comparable percentage shown in the previous
table. The reduction is particularly drastic for Whites in the FR data set. This
indicates that an effect of the refinement process was primarily to eliminate
personnel having large numbers of omitted items.


Extrapolations were made from the complete percentage of omissions figures to
compute a mean number of items omitted for each of the eight groups. Thus for the FR
data set, the mean items omitted by Blacks for the complete and the refined samples
were 2.43 and 1.05, respectively. Comparable means for the Whites were 1.05 and .19.
For the TR data set the means were 3.27 and 2.21 for Blacks and 2.16 and 1.01 for
Whites. Thus on average Blacks omitted one more item on the test than did Whites and
the falloff in mean number of items omitted in the refined sample was about one item
for each of the racial groups.
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person^lS  "»*PeCti0nS  of  the  dat*  fu«**r  indications  were  found  that 

wWilv  hld^riLJ  C0,ipUter  progra*  durin*  th0  refinement  process 
SSt«rw  ffi  *  I"*  d*temined  that  ^  relationship  was 
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-  ♦  zir- 

Discussion and Conclusions

Whatever the advantages of latent trait scaling may be in terms of economy and
accuracy of measurement, in the present study the technique did not serve to provide
a better predictor for Blacks and Whites than the commonly used RGL. From the data
presented it is clear that several reasons were responsible for the failure. The
major one was that, because of characteristics of the program, θ̂ values simply were
not available for the personnel having the lowest reading abilities. The restriction
of range resulting from this deficiency undoubtedly lowered the Pearson rs computed
for θ̂.
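
The restriction-of-range argument can be illustrated with a small simulation. The sketch below is not a reanalysis of the study data: it generates synthetic ability and criterion scores with an assumed population correlation of .50, drops the lowest-scoring 30 percent on ability, and compares the Pearson r before and after. All values and names are assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, true_r=0.5, drop_fraction=0.3, seed=0):
    """Return (r on the full sample, r after removing the lowest scorers)."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = math.sqrt(1 - true_r ** 2)
    criterion = [true_r * x + rng.gauss(0, noise_sd) for x in ability]
    # Remove the lowest-ability cases, mimicking examinees with no estimate.
    cutoff = sorted(ability)[int(n * drop_fraction)]
    kept = [(x, y) for x, y in zip(ability, criterion) if x >= cutoff]
    r_full = pearson_r(ability, criterion)
    r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])
    return r_full, r_restricted


r_full, r_restricted = simulate()
```

In this synthetic setup the restricted correlation comes out lower than the full-sample correlation, which is the direction of the effect described in the text; the size of the drop in the actual study cannot be inferred from the sketch.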

However, not all is lost for the latent trait model. For the present research
latent trait scoring, even after reducing the number of items between 25 and 50
percent for Blacks and Whites, respectively, provided a measure which substantially
reproduced the total score on the complete Gates-MacGinitie test. Even the present
set of items, which in terms of their latent trait parameters are considerably
below the desirable level, could save a great deal of administration time if they
were administered by means of a branching technique which selected each new item to
correspond as closely as possible to the estimated ability of the examinee.
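
The branching idea mentioned above can be sketched as a simple selection rule: among the items not yet given, choose the one whose difficulty is closest to the current ability estimate. The paper does not specify the branching rule in more detail, so the function below, its name, and the example difficulties are assumptions; re-estimating ability after each response is left out.

```python
def next_item(theta_estimate, item_difficulties, administered):
    """Return the index of the unused item closest in difficulty to theta.

    item_difficulties: list of b values, one per item.
    administered: set of item indices already given.
    Returns None when every item has been administered.
    """
    candidates = [i for i in range(len(item_difficulties))
                  if i not in administered]
    if not candidates:
        return None
    return min(candidates,
               key=lambda i: abs(item_difficulties[i] - theta_estimate))


# Hypothetical item bank of five difficulties on the ability scale.
difficulties = [-1.5, -0.5, 0.0, 0.8, 1.6]
first_choice = next_item(0.6, difficulties, administered=set())
```

A full branching procedure would update the ability estimate after each response and call the selection rule again, stopping once the estimate is stable enough or the item bank is exhausted.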

A number of comparisons in the study suggest that the item-test relationships
were substantially similar for Blacks and Whites. The correlations of θ̂ with RGL were
.94 and .90, respectively, for Blacks and Whites. This suggests a high degree of
relationship between the θ̂ values and the possibly contaminated RGL. In addition
the item difficulty and slope parameters for the items reviewed indicate a
considerable correspondence in these relationships for Blacks and Whites. Although
these statistics were undoubtedly for the best items in the test, the high
relationship between the RGL and θ̂ suggests that the other items in the test do not
substantially modify the correspondences in item statistics of Blacks and Whites.

Inspection of the correlations of the item statistics suggests
that greater divergences between Blacks and Whites occur for the latent trait
statistics than for the statistics of classical test theory. The determination
of which of these sets of statistics most accurately describes relationships in
the real world must be left to future research.

The above data suggest that it would be desirable to look in other locations
for the explanation of the lower-than-predicted attrition rates for
Blacks. Two possible reasons for this phenomenon suggest themselves: (1) Blacks
assigned to Academic Remedial Training may simply try harder than Whites assigned
to Academic Remedial Training, or (2) because of the current concern with increasing
the proportions of Blacks in the Navy and the subsequent reluctance to eliminate
Blacks, teaching personnel in ART may either provide more assistance to Blacks than
Whites in the training, or may employ more lenient evaluation standards for Blacks
than for Whites, or both. Any of these effects would bring about a difference between
Blacks and Whites in the strength of the reading level - Recruit Attrition
relationship.
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SAMPLE  SIZES  AMD  TEST  STATISTICS  POR  THE  DATA  SETS 


COMPARISON OF BLACKS AND WHITES IN TERMS OF ITEMS HAVING


Product Moment Correlations of θ̂ and RGL for
Written Test, Biographical and Criterion Variables

                                       FR Data Set                          TR Data Set
                                   θ̂                RGL                 θ̂                RGL
Variable                      Black    White    Black    White     Black    White        Black    White

Successful Completion of
  Recruit Training            .05      .09**    .09**    .34***    .04      [illegible]  .10**    .19***
Highest Paygrade Received     .24***   .25***   .26***   .40***    .10**    .01          .16***   .23***
Total Number of Promotions    .00      -.04     .05      .18***    .05      [illegible]  .08*     .18***
GCT                           .72***   .69***   .??***   .68***    .54***   .47***       .59***   .53***
ARI                           .51***   .44***   .48***   .47***    .27***   .23***       .29***   .31***
Years of Education            .21***   .30***   .21***   .30***    .05      .09**        .10**    .13***

[Figure: SCATTERGRAM OF THE ITEM PARAMETERS OF BLACKS AND WHITES (parameter label illegible)]


PERCENTAGE OF REFINED SAMPLE OMITTING ITEMS

                             FR DATA SET         TR DATA SET
No. Items Omitted            Black    White      Black         White

Part 1 (Vocabulary)
         5                     5        1          8             5
        10                     2        0          4             3
        15                     1        0          2             0
        20                     1        0          2             0
        25                     1        0          [illegible]   0

Part 2 (Reading Comprehension)
         5                     7        1         11             5
        10                     2        0          4             2
        15                     1        0          2             1
        20                     0        0          0             0

Categorization  of  Blacks  and  Whites  in  the  TR  Data  Set 
in Terms of θ̂ Ability Estimates
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INITIAL  TEST  OF  ARMORED  CAVALRY  ENGAGEMENT  SIMULATION 
Claramae  S.  Knerr 

US Army Research Institute for the Behavioral
and Social Sciences

MAJ  Angelo  A.  Severino 
US  Army  Training  Support  Center 

John  J.  Bosley 

Human  Sciences  Research,  Inc. 

Engagement simulation is a generic term for training techniques
that provide realistic tactical training under conditions that simulate
the complex modern battlefield. The emphasis for the realism is on the
psychological fidelity of the training environment and procedures (Root,
1976). Fidelity factors include the cues to which the soldiers must
respond, their opportunity to respond, and changes in the situation as a
result of their actions. Three characteristics of engagement simulation
(ES) exercises contribute to psychological fidelity: (a) they are
two-sided, free-play tactical exercises, with (b) objective, real-time
casualty assessment, and (c) simulation of all weapons effects and
signatures.

The earliest type of ES, called SCOPES (for Squad Combat Operations
Exercise, Simulated), was developed for infantry squads. In SCOPES
exercises, squads conduct two-sided free-play exercises, so that each
force opposes a motivated, intelligent enemy. Objective casualty
assessment is achieved when a soldier, looking through a six power
telescope mounted on his M16 rifle, correctly reads a three inch, two
digit number on the helmet of an opposing unit member. The power of the
telescope and the size of the helmet number are calibrated to produce
hit/kill probabilities realistic for the weapon's lethality. A casualty
is assessed when the soldier fires a blank round and correctly
identifies the opposing helmet number. The soldier must fire a blank. If
not, it is considered a misfire and no casualty is assessed. A
controller with the fire team radios the helmet number to the controller
with the opposing element, who informs the target soldier. Soldiers who
have been "hit" must remove their helmets, lie down, and not communicate
or otherwise participate in the exercise.

Physical fidelity of this casualty assessment method cannot be
considered high. The only indication that soldiers have that they are
casualties is when a controller tells them that they have been hit.
However, because they know that casualties are assessed using strict
rules, they know that they have performed incorrectly (e.g., did not
stay under cover). Therefore, the situation has psychological fidelity.
Soldiers learn very quickly to low crawl.

Procedures to conduct tactical ES exercises with combined arms
elements have been developed and implemented under the name REALTRAIN.
Procedures for objective casualty assessment have been established for
the M60 machinegun, hand grenade, M18A1 Claymore, M16A1 anti-personnel and M21
anti-tank mines, tank main gun, and light, medium, and heavy anti-tank
weapons (LAW, DRAGON, and TOW). For weapons with longer range than the
M16 rifle, the controller is equipped with optics to sight the
individual helmet numbers, or numbers on panels attached to the vehicles. For
example, tank controllers have ten power breech mounted telescopes, and
controllers with TOW gunners have ten power telescopes mounted on the
TOW sight.

Indirect fire, either mortar or artillery, is simulated by
detonating artillery burst simulators at the actual impact location
requested by the players. The artillery simulators are delivered by
fire markers, usually mounted in jeeps. Controllers with the players
assess casualties within the "kill radius" of the simulated rounds when
the simulators are detonated. For example, exposed soldiers within a 50
meter radius of a simulated 4.2 inch (107 mm) mortar burst are assessed
as casualties, and vehicles lose communications although they are not
destroyed. The communication loss enhances the psychological fidelity
by simulating confusion caused by indirect fire.
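
The kill-radius rule in the mortar example can be written out as a short sketch. In the exercises the assessment is made by controllers on the ground, not by software; the code below only restates the stated rule (exposed soldiers within 50 meters become casualties, vehicles within the radius lose communications). The class, field names, coordinate scheme, and the choice to treat exactly 50 meters as inside the radius are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Player:
    ident: str
    x: float                 # meters east of an arbitrary origin (assumed)
    y: float                 # meters north of an arbitrary origin (assumed)
    exposed: bool = True     # False if the soldier is under cover
    is_vehicle: bool = False


def assess_indirect_fire(players, burst_x, burst_y, kill_radius_m=50.0):
    """Return (casualty ids, ids of vehicles that lose communications)."""
    casualties, comms_lost = [], []
    for p in players:
        if math.hypot(p.x - burst_x, p.y - burst_y) > kill_radius_m:
            continue  # outside the kill radius: no effect
        if p.is_vehicle:
            comms_lost.append(p.ident)   # vehicles are not destroyed
        elif p.exposed:
            casualties.append(p.ident)   # only exposed soldiers are assessed
    return casualties, comms_lost


# Hypothetical layout around a burst at the origin.
players = [
    Player("S1", 10, 0),
    Player("S2", 20, 0, exposed=False),
    Player("V1", 0, 30, is_vehicle=True),
    Player("S3", 100, 0),
]
casualties, comms_lost = assess_indirect_fire(players, 0, 0)
```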

The sights and sounds of battle are represented by pyrotechnics.
Each soldier's weapon, crew-served weapon, and armored vehicle is
equipped with pyrotechnics to simulate the flash and noise of the weapon
signature. These signature simulators provide an important aspect of
the psychological fidelity. The firer may have an excellent position,
but upon firing the signature changes the situation by cuing the enemy
as to the firer's location. The firer is likely to be "hit" unless he
moves after firing. The signature simulators do not exactly reproduce
the actual weapon signature; however, they do produce situation changes
that necessitate realistic player responses.

ES training entails three stages: (a) a free-play tactical
exercise, (b) After Action Review (AAR), and (c) successive repetitions of
the exercises and AAR. The exercise provides performance training under
realistic tactical conditions in a discovery, or trial and error,
paradigm coupled with structured feedback. Each exercise is followed by
an AAR which recreates the action and provides additional information to
the soldiers as to the consequences of their actions. Soldiers who
"killed" another soldier or vehicle describe how they detected and
"destroyed" the enemy. Soldiers who were "casualties" hear from their
peers the errors that led to their being "hit." Although disagreements
arise, which are sometimes very spirited since motivation and
competition are high, the objective casualty assessment system indicates
convincingly the "killers" and the "killed." Feedback from the opposing
force reinforces learning through peer dialogue. The AAR leader guides
the discussion, but does not critique or lecture.

The AAR leader is usually a senior controller, who is not assigned
to a participating vehicle, but serves to coordinate the controllers,
control the exercise as a whole, and act as the unit commander. The AAR
leader uses a record of the casualties to guide the discussion, which
recaps the exercise chronologically. The record is maintained in an
exercise control station where personnel write the time and elements
"hit" as the controllers report them on the radio. For individual
soldier casualties, for example, the controller reports "29 killed by
45, 29 killed by 45." The controller with individual 29 acknowledges
the "hit" by reporting "29 confirmed, 29 confirmed." The exercise net
control station (NCS) recorder writes the time, target number, and firer
number, and checks that the "hit" was confirmed. Thus, the sequence of
casualties is recorded for the AAR leader to guide the discussion.
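
The NCS record described above is kept by hand, but its structure (time, target number, firer number, confirmation) can be pictured as a small log. The sketch below is only an illustration of that structure; the class names, the four-digit time strings, and the methods are assumptions and do not describe any actual REALTRAIN form or software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CasualtyEntry:
    time: str                # time the report was logged, e.g. "0912" (assumed format)
    target: int              # helmet or vehicle number of the element "hit"
    firer: int               # helmet or vehicle number of the firing element
    confirmed: bool = False  # set once the target's controller confirms


class NcsLog:
    """Hypothetical stand-in for the handwritten NCS casualty record."""

    def __init__(self):
        self.entries: List[CasualtyEntry] = []

    def report(self, time, target, firer):
        """Record a radio report such as '29 killed by 45'."""
        entry = CasualtyEntry(time, target, firer)
        self.entries.append(entry)
        return entry

    def confirm(self, target):
        """Mark the latest unconfirmed report for target, e.g. '29 confirmed'."""
        for entry in reversed(self.entries):
            if entry.target == target and not entry.confirmed:
                entry.confirmed = True
                return True
        return False

    def chronology(self):
        """Confirmed casualties in time order, for the AAR recap."""
        return sorted((e for e in self.entries if e.confirmed),
                      key=lambda e: e.time)


log = NcsLog()
log.report("0912", target=29, firer=45)   # "29 killed by 45"
log.report("0915", target=31, firer=12)   # hypothetical second report
log.confirm(29)                           # "29 confirmed"
```

Only confirmed entries appear in the chronology, mirroring the recorder's check that each "hit" was confirmed before it is used in the AAR.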


Between the exercise and the AAR, the AAR leader meets with the
controllers to review the NCS record, correcting and augmenting it to
enhance the AAR. This controller debrief is used to settle
controversies over "hits" and derive training points to emphasize in the AAR.

Validations of both the infantry squad SCOPES and combined arms
REALTRAIN indicate that ES training is effective in achieving tactical
proficiency. Tactical ES trained units improved in aspects of tactical
proficiency such as (a) maximizing effects of available weapons on the
enemy, (b) minimizing effects of enemy weapons, (c) effective intra- and
inter-unit coordination, and (d) adaptive response to enemy actions in a
dynamic combat situation. In the SCOPES validation, performed this May
at Fort Ord, SCOPES trained squads were compared with conventionally
trained squads (Banks, J.H., Hardy, G.D., Scott, T.D., Kress, G., and
Word, L.E., 1977). In the REALTRAIN validation, performed in Europe in
1975-1976, combined arms units with three weeks of ES training were
compared with similar units in their first week of ES training (Root,
R.T., Epstein, K.I., Steinheiser, F.H., Hayes, J.F., Wood, S.E., Sulzen,
R.H., Burgess, G.C., Mirabella, A., Erwin, D.E., and Johnson, E., 1976).
In addition to the performance indicators listed above, the controllers
and participants reported that, in their opinions, the ES exercises
provided effective training (more effective than conventional training).

The nature of armored cavalry presented a threefold challenge for
ES development: the reconnaissance function, a combined arms
composition, and the inclusion of mortar. First, the armored cavalry functions
as the "eyes and ears" of the maneuver forces, performing reconnaissance
missions, information gathering and reporting. These missions do
not lead to the casualty producing engagements typical of other maneuver
arms tactical training. In some instances, they are one-sided, with no
firing or casualty assessment, thus no weapons effects or signature
simulators. Thus, all three aspects of ES that enhance psychological
fidelity would be inoperative. An example would be reconnaissance of an
area that does not contain enemy elements. In other instances, enemy
elements may be present, so that the exercise is two-sided. If the
opposing elements fire, then the exercise converts to a
casualty-producing ES training, and the standard ES procedures apply. On the other
hand, if they do not fire, but continue to perform information gathering
and reporting functions (e.g., reports of enemy detection) then the
reconnaissance activities can be reenacted in the AAR. However, without
special techniques, the AAR dialogue would be the opinion of one
opposing force against the other, still lacking the objective assessment
that makes ES casualties credible and convincing.

Second, the armored cavalry platoon is the smallest combined arms
force in the Army, containing scout, light armor, infantry, and mortar
sections. The task of simulating the entire array of armored cavalry
weapons set the scope of the development, with the complexity of the
exercises a major concern. For successful development and eventual
implementation, the cavalry ES system had to be as simple as possible
for units to employ.

Third, the mortar section is organic to the armored cavalry
platoon; therefore it was included in the tactical exercises. In past ES
exercises indirect fire elements were merely simulated, and fire markers
delivered artillery burst simulators to indirect fire impact locations.
In contrast to the previous indirect fire methods, the mortar section
was physically present with the maneuver forces in the armored cavalry
exercises.

In addition to the aspects unique to armored cavalry, all of the
usual aspects of engagement simulation, described below, required
development.

Weapon effects and signature simulation. New hardware and
accompanying procedures for its use were developed for some weapons, including
the M551 Sheridan, Armored Reconnaissance Airborne Assault Vehicle (with
conventional HEAT and Shillelagh missile), M114 scout vehicle, Armored
Command and Reconnaissance Carrier (with 20mm cannon), and the 4.2 inch
(107mm) mortar on the M106 Armored Mortar Carrier. Signature
simulators, controller optics, and rules for their use were devised.

Exercise control. Controller duties, rules of engagement, casualty
assessment, and controller communications were tailored to the vehicle
type, crew, and weapon system. Each vehicle had one controller, except
for the infantry M113 which had two controllers, one for each fire team.
Each opposing force had a senior controller who functioned as the next
higher unit commander. The senior controller acts as the troop
commander when armored cavalry platoons are the opposing forces.

Exercise recording. In typical REALTRAIN exercises, Net Control
Station (NCS) personnel record the simulated casualties and
confirmations. To incorporate reconnaissance information, sightings (any
detection of enemy activity and elements) were also reported, confirmed,
and recorded on the NCS record. Other methods for recording
reconnaissance information were field notes, kept by the vehicle controllers,
and logs of the tactical radio nets.

After Action Review. The REALTRAIN NCS records, tactical notes,
and reconnaissance information were compiled during the controller
debrief after the exercise, and used as input to the AAR. More emphasis
was placed on information gathering and reporting than in AARs for
typical REALTRAIN exercises.


OBJECTIVES 

The overall objective was to develop engagement simulation for
armored cavalry. Specifically, the objectives in testing the candidate
procedures were to examine:

1. Procedures designed to emphasize the reconnaissance functions
in ES exercises,

2. Procedures for incorporating reconnaissance functions into the
controller debrief and the After Action Review,

3. Controller procedures and the control system, and

4. Effectiveness of the weapons effects and signature simulators
for armored cavalry weapons.


METHOD 

To meet the research objectives, the following types of
instruments, described in the paragraphs below, were developed for data
collection:

1. Records of information gathering and reporting functions, and

2. Indirect measures such as attitudinal data concerning the
procedures, simulators, training value, and AAR.

Records of information gathering and reporting functions. A
variety of procedures were tested to incorporate reconnaissance
functions into the exercises. Some of the forms used to gather data on
these functions are forms primarily used in support of the ES training
method, for recording the exercise events. The first one was the
casualty record sheet typically maintained by the net control station
(NCS) during the exercises. The NCS record includes the target, firer,
time, and confirmation of each casualty. This casualty record sheet was
altered to include reports of enemy detections (e.g., by sighting the
enemy) in addition to casualties. The detection was called over the
exercise control net, analogous to the call of a casualty. The target
(sighted enemy element, in the case of sightings), firer (element that
sighted the enemy), time, and confirmation were recorded on the NCS
sheet. The altered sheet is shown in Figure 1.

The senior controllers kept notes during the exercises, largely
critical incidents and reconnaissance information from the troop
tactical nets. The senior controller who conducted the AAR used these notes
to reconstruct the action and to focus discussions of the reconnaissance
functions.

Printed three by five cards were prepared for the vehicle and
infantry fire team controllers (Figure 2). The controllers were
instructed to write the individual helmet and vehicle numbers on one side
of the card at the start of the exercise. This form was designed to
encourage the controllers to keep records of the numbers, particularly
those they control, and the casualties. These practices had proven
useful in previous ES exercises. On the reverse side of the card,
controllers were instructed to write notes to assist their participation
in the controller debrief (orders, plans, troop questions, problems, and
enemy detections).

Indirect Measures. Questionnaires were developed to record
subjective judgments of the participants and controllers. Participants were
asked to rate training value, simulator credibility, and utility of the
candidate procedures. Vehicle and infantry fire team controllers were
asked about casualty assessment and other ES procedures, hardware
utility, simulator credibility, controller debrief and AAR, and training
value of the exercises for the controllers.

PRELIMINARY  FIELD  TESTS 

Procedures were drafted for armored cavalry ES and were examined
and revised in a series of field tests. These field tests were
developmental in nature, rather than validations of a completed system.
Although data were collected whenever possible, no formal experiments
were conducted. Field validation, as conducted for SCOPES and REALTRAIN,
awaits completion of an initial system for armored cavalry ES.

Figure 3 summarizes the field tests for Armored Cavalry ES. Some
small scale exploratory tests were run to examine the draft procedures
and the hardware devised to simulate the armored cavalry weapons. The
exploratory tests were done during Basic NCO Courses (BNCOC) at Ft Hood
and Ft Bliss. BNCOC prepares soldiers for squad leader (E6) positions in
infantry, armor, artillery, combat engineer, and air defense. The course
contains three days of ES exercises.

FIGURE 3. DEVELOPMENTAL TEST SUMMARY

Ft Hood BNCOC Exercises. The Ft Hood BNCOC ES exercises contained
infantry squads and tanks, with approximately ten vehicles per exercise.
The instructors added a scout squad to one of the opposing forces and
used it with no difficulty. The M60 machineguns mounted on the scout
vehicles had been simulated in engagement simulation before, and the
procedures were satisfactory for the scouts. These exercises were an
important first step in the armored cavalry engagement simulation
development. Although no formal data were collected, the inclusion of
scouts was obviously feasible. The BNCOC instructors provided ideas
that were very valuable for further development of armored cavalry ES
and they tried out data collection forms prior to use in larger exercises.

Ft Bliss BNCOC Exercises. At the Ft Bliss BNCOC, tests focused on
integration of the reconnaissance functions and initial use of
procedures for the Sheridan M551 weapons effects and signature simulators.
Three exercises were conducted employing scouts, light armor and
infantry squads with approximately eight vehicles per exercise. BNCOC
students served as controllers, and some of the instructors assisted
with the managing of the exercises. Their highly skilled assistance in
small, easily manageable exercises facilitated the examination of new
procedures.


The scouts were mounted in M113s with .50 caliber machineguns and
in the same type of vehicles that had been tested at Ft Hood. Since
both of these vehicles and their weapons had been tested in ES exercises
before, it was simply verified that their simulation was satisfactory.
The primary vehicle to be examined during the Ft Bliss BNCOC exercises
was the Sheridan M551.

The intended signature simulator for the Sheridan main gun is the
Hoffman device, which has been used successfully as the M60 tank main
gun simulator in previous ES exercises. Unfortunately, Hoffman rounds
were not available for these tests. The substitute was an M116 hand
grenade simulator, detonated to simulate the noise and flash of the gun.
The Hoffman device, which provides more realistic noise and flash, is
the preferred simulator.

A modified missile aft cap, with a ten power telescope inserted in
the center, was used in the breech of the main gun as the controller
telescope. It seemed to be satisfactory during the BNCOC exercises.

Some methods for incorporating the reconnaissance functions were
pretested at the Ft Bliss BNCOC. Enemy detection information was
reported over the exercise control net. A large number of sightings (25)
were reported but few were confirmed (4). These reports contributed
little to the AAR. Reporting them over the exercise control net
substantially increased the load on the net, depending on the number of
reports attempted.

Vehicle and infantry fire team controllers in the exercises were
provided with 3x5 cards on which to record the ES numbers and their
notes for the controller debrief. Almost all of the BNCOC controllers
used the cards. The controllers used the cards in 19 of the 20 possible
instances: 18 of these were used for notes for the controller debrief,
and 11 were used to record the ES numbers used during the exercises.

The senior controllers kept notes during the exercise or had
assistants in their jeeps to help them keep the notes. They used these
notes during the AARs to reconstruct the action. They incorporated the
notes with the maneuver exercise control net records, which contained
sightings reported over the control net. However, even with all these
sources of data, or perhaps because of them, it was very difficult to
incorporate the reconnaissance information into the AAR. There were too
many sources of information and the functions were too complex to bring
together quickly after the exercise in the field environment.

The BNCOC participants and controllers completed questionnaires
(described in the Indirect Measures section of this report) and
commented on them. Their responses and comments were used to revise the
questionnaires prior to their use in larger exercises.

PLATOON  EXERCISES 

Armored cavalry ES procedures were revised on the basis of the
BNCOC results, and tested in May 1977, with troop support provided by
the 3d Armored Cavalry Regiment (ACR), Fort Bliss, Texas. C Troop, 1st
Squadron, was the test unit.

Controller Training. The first three days of the two week test
were devoted to controller training, in which C Troop personnel were
trained in ES procedures, controller duties, and the After Action
Review. Approximately eight hours of practical exercises were conducted
for controllers and participants to practice their duties. In these
practical exercises, the platoon was divided into sections, so that the
opposing forces were scouts versus scouts, infantry versus infantry, and
Sheridans versus Sheridans.

A controller communication exercise was conducted to familiarize
the controllers with ES procedures and duties. The full complement of
controllers practiced sending, receiving, and confirming typical ES
control messages over a radio net prior to the first full-sized exercise
they controlled. The transmissions were tape recorded and played back
for discussion. The session, with playback, was conducted twice. This
training was evaluated so favorably that it was incorporated forthwith
in the REALTRAIN implementation program.

Exercises. Six days of platoon versus platoon exercises were
conducted, with one exercise and After Action Review each day. The
armored cavalry platoon composition, with vehicles organic to the test
unit, is shown in Figure 4. Due to staffing levels and maintenance
requirements, fewer than the full complement of ten vehicles per platoon
participated in some of the exercises. Each platoon in C Troop
participated as one of the opposing forces in four exercises (Table 1), and
as controllers in the other two exercises.

Missions. Missions were selected from the Army Training and
Evaluation Program for Armored Cavalry Squadron and Armored Cavalry
Troop (ARTEP 17-55), with assistance of 1st Squadron personnel. The
missions were those appropriate for a regimental squadron, were
considered to be of training benefit to C Troop, and emphasized the
reconnaissance functions. They were representative of level 1 ARTEP
missions, i.e., those comprising the minimum acceptable performance for
a combat-ready, full strength unit. Missions were paired in each
exercise so that one platoon had a reconnaissance mission while the
opposing platoon had a screen or delay mission (Table 1). The platoon
with the screen or delay mission had time to prepare a position before
the opposing platoon moved to contact. The 1:1 force ratio was
tactically unrealistic for what amounted to an attack against a prepared
defense, but it was highly desirable to train each platoon as a unit.

Fig. 4. Armored Cavalry Platoon Composition During ES Exercises.

TABLE 1. PLATOONS AND MISSIONS BY EXERCISE

Terrain. The training area was flat desert, having only 40 feet
difference between the high and low elevation. It was dotted with sand
dunes and low scrub vegetation. Unpaved trails were the only features
that assisted in position location, and they were visible for only short
distances because of the sand dunes. The exercise lanes were
approximately three by six kilometers. The major axis of each exercise lane
followed one of the trails. Position location proved to be very
difficult and unreliable on this terrain.

RESULTS  AND  DISCUSSION 

Target reports and confirmation. The objective casualty system is
a primary strength of ES training, as described in the introduction.
Confirmed casualty information, with certainty as to who engaged whom,
provides immediate and definite feedback. In contrast to typical ES
exercises, where virtually all of the casualties are reported by number
and confirmed, only a third (31 of 104 targets) of the casualties were
reported by number and confirmed in this test (Table 2). Of the 104
targets reported during the six exercises, 38% were reported by ES
number, while 62% were reported by coordinates.

Terrain characteristics appeared to be responsible for the low
number of target reports by number. The opposing forces were unable to
maneuver without being detected (due to vehicle exhaust smoke or dust
clouds) at long ranges. Thus, vehicles were engaged either at ranges
beyond those at which the numbers were legible, or when sand dunes
obscured the number panels. The controllers cited problems in
identifying the ES numbers of opposing vehicles, giving the engagement
distances as the main reason. Of the 48 controllers who responded to the
question, 29 (60%) reported that the enemy vehicles were too far away to
read the number.

Only 58% of the targets were confirmed (60 of the 104 targets
reported). The percent of confirmations was significantly higher for
targets that were reported by ES number (78%) than for targets that were
reported by coordinates (45%; z = 3.18, p<.01). Targets are easier to
confirm when they are reported by ES number, since the controller on the
specified opposing vehicle can hear and respond to the radio message.
Confirmation of a target reported by coordinates requires that the senior
controllers carefully check vehicle positions, and contact the possible
target vehicles individually in an attempt to confirm the casualty.

TABLE 2

Target Reports and Confirmation

                        Targets Reported
Confirmation    By ES Number    By Coordinates    Total

Yes                  31               29            60
No                    9               35            44
Total                40               64           104

Lack of terrain features, and inexperience of the controllers, made the
location determinations difficult and inaccurate. The senior
controllers had to locate the vehicles, often by extensive radio use,
increasing the load on the exercise control net substantially over the
load in typical exercises. These additional transmissions taxed the senior
controllers, who were responsible for troop command as well as exercise
control. Transmission load on the control net degrades the exercises by
interfering with controller reports and confirmations of casualties.
Slow or inaccurate removal of elements reported as targets decreases the
realism during the exercise, and makes reconstruction of the action in
the AAR less convincing. For example, since accurate coordinates were
difficult to determine, the crews of target vehicles were not convinced
that their vehicles were the ones reported as targets, especially if
other vehicles were nearby. Thus, reinforcement value from the
objective, definite casualty system was decreased in approximately half of
the simulated engagements during these exercises.
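
The comparison of confirmation rates reported above (78% by ES number
versus 45% by coordinates) can be rechecked from the Table 2 counts. The
report does not say which test produced z = 3.18, so the sketch below
assumes a pooled two-proportion z test; the function name is illustrative
only. Under that assumption the statistic comes out near 3.2, close to
but not identical with the reported value, and the difference may reflect
rounding or a different variant of the test.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (assumed method, not stated in the report)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Counts from Table 2: confirmed targets / targets reported.
confirmed_by_number, reported_by_number = 31, 40
confirmed_by_coords, reported_by_coords = 29, 64

z = two_proportion_z(confirmed_by_number, reported_by_number,
                     confirmed_by_coords, reported_by_coords)

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

Either way, the conclusion in the text is unchanged: the difference
remains significant at the .01 level.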

Results of incorporating reconnaissance functions. Enemy detection
information was reported over the exercise control net in the first two
platoon exercises. Only 4 sightings were reported, and only one of
these was confirmed. These reports contributed little to the AAR, while
they increased the load on the control net. Due to the low usefulness
and interference with the control net, sighting reports were
discontinued after the second platoon exercise.

The AAR leaders found the notes that they kept during the exercises
to be the most helpful tactical record during the controller debrief and
the AAR. Such notes are difficult for the senior controllers to
maintain, since they are traveling over rough terrain, and since they have
the additional responsibility of functioning as the unit commander.
Future emphasis will be on improving methods and use of the senior
controller notes.

Vehicle and infantry fire team controllers in the platoon exercises
were provided 3 by 5 cards to record ES numbers and notes for the
controller debrief. The controllers used almost all of the cards (111
of 120, or 93%). They used over half of the cards to record the ES
numbers (68%), and just less than half to record notes for the
controller debrief (42%). About a third of the cards had both ES numbers
and notes (37%). This field note card usage is high compared to usual
paper work in field exercises.

The high usage rate is corroborated by the controllers' ratings
(N=48) of the field note card utility (including use for both ES numbers
and notes):

    Good    58%
    Fair    33%
    Poor     8%

Overall, the controllers (N=48) reported favorably on the usefulness of
the field notes:

    Very helpful         44%
    Somewhat helpful     19%
    Not helpful           6%
    Didn't take notes    25%
    No response           6%


During the six platoon exercises, 91 tactical reports were recorded
from the troop tactical radio net. Approximately half (45) were reports
of enemy sightings, which represent an average of 7.5 reports per
exercise. The records proved too voluminous for the AAR leader to organize
prior to, or during, the controller debrief. However, procedures are
being drafted for testing in the next field experiment to enable the AAR
leader to use the tactical radio net records to reconstruct the
action, especially as needed for the reconnaissance functions.

Casualty Assessment. Casualty assessment rules, printed on cards,
were distributed to the controllers for their use during the exercises.
These cards, used to reinforce the casualty assessment training,
appeared to be effective. The controllers reported that they had no
problems with casualty assessment (39 of the 48 controllers who answered
the question, or 81%, marked the response category "no problems").
Their reports were consistent with observations by the training advisors
and research personnel.

Vehicle Casualties by Mission. Table 3 presents vehicle casualties
by mission type. Platoons assigned reconnaissance missions (zone or
route) lost 65% of their vehicles, while platoons with screen or delay
missions lost only 34%. These outcomes appear realistic given the 1:1
force ratios of the moving and defending elements. When equal forces
meet in battle, the moving force is expected to be at a disadvantage,
compared to the force in a prepared position. The realistic outcome
statistics attested to the realism of the exercise itself.

Table 3

Simulated Engagements by Mission

                             Nr. Vehicles    Nr. Vehicles    % Vehicles
Mission                         Played          "Hit"           "Hit"

Route Recon.                      36              22             61%
Zone Recon.                       18              13             72%
Recon. Total                      54              35             65%

Screen                            26              11             42%
Delay                             27               7             26%
Prepared Position Total           53              18             34%
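
As a cross-check on Table 3, the percentages and subtotals can be
recomputed from the vehicle counts. The sketch below is only an
arithmetic check; the dictionary layout and function names are
illustrative and not part of the original analysis.

```python
# Counts transcribed from Table 3: mission -> (vehicles played, vehicles "hit").
TABLE_3 = {
    "Route Recon.": (36, 22),
    "Zone Recon.": (18, 13),
    "Screen": (26, 11),
    "Delay": (27, 7),
}

def percent_hit(played, hit):
    """Percent of played vehicles that were hit, rounded to a whole percent."""
    return round(100 * hit / played)

def subtotal(missions):
    """Sum (played, hit) over a group of missions."""
    played = sum(TABLE_3[m][0] for m in missions)
    hit = sum(TABLE_3[m][1] for m in missions)
    return played, hit

recon = subtotal(["Route Recon.", "Zone Recon."])
prepared = subtotal(["Screen", "Delay"])

for name, (played, hit) in TABLE_3.items():
    print(f"{name:<26}{played:>4}{hit:>4}{percent_hit(played, hit):>5}%")
print(f"{'Recon. Total':<26}{recon[0]:>4}{recon[1]:>4}{percent_hit(*recon):>5}%")
print(f"{'Prepared Position Total':<26}{prepared[0]:>4}{prepared[1]:>4}"
      f"{percent_hit(*prepared):>5}%")
```

The recomputed subtotals (54 played and 35 hit for reconnaissance, 53
played and 18 hit for prepared positions) agree with the 65% and 34%
figures quoted in the text.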

Weapons Effects and Signature Simulators. Procedures and hardware
for simulating the M551 Sheridan main gun, M114 scout vehicle 20mm
cannon, and 4.2 inch (107mm) mortar were evaluated. M113 armored
personnel carriers with either the .50 caliber machinegun or TOW were
also played, but their evaluation was not a primary issue because they
have been played in past ES exercises.

Table 4 shows that the TOW missile inflicted the largest number of
vehicle casualties: 23 of the total 53 vehicles "destroyed." The TOW
missile has long range and high lethality, and so accounts for a high
portion of the casualties. The TOW scout vehicle is among the leading
elements of the platoon; therefore, it contacts the opposing force early
in the exercise. In this test, both the TOW missile and .50 caliber
machinegun mounted on the same vehicle were used effectively.

M114 Scout vehicle with M139 20mm gun/cannon. The 20mm cannon
signature was simulated by an M117 flash simulator. Several of the
simulators were attached to a board on the M114 scout vehicle front, and
were detonated by pulling a trip wire. The simulator was easy to hear,
but did not ideally represent the gun signature. Safety was a major
problem, as noted in incidents such as accidental firings. The M117 is
an interim device to be used only until a signature simulator is
developed for the 20mm cannon.

Controller optics for the 20mm cannon were fabricated from the TOW
controller optics. A 10 power telescope was attached to the cannon
above the gunner's 13 power sight. During the two weeks of the
exercises, threads in the mounting block became damaged so that the
telescope worked loose and did not remain aligned with the gunner's sights.
Thus, the controller had a different sight picture than the gunner, and
could not identify targets properly. The mount is being improved to
solve this problem.

M551 Sheridan with 152mm gun/missile launcher. The same signature
simulator (M116 hand grenade simulator) and controller optics were used
as described for the BNCOC tests. The controllers and participants
reported favorably on the hand grenade simulator (e.g., easy to hear,
realistic simulation of the main gun), but they also suggested that the
signature simulation be improved as to loudness, flash, and smoke. When
used in this armored cavalry application, the Hoffman device will
provide these improvements and the necessary realism.

The modified missile aft cap, with a ten power telescope inserted
in the center, proved unsatisfactory during the platoon exercises.
During the exercises, the aft cap vibrated loose, and on occasion fell
out of the breech. The missile aft cap has been further modified to
correct this problem.

Sheridans contributed relatively little to the vehicle casualties
(Table 4), despite the long range and high lethality of the main gun.
They were held in reserve to react to enemy contact rather than joining
the casualty-producing engagements. This can be attributed to the fact
that the TOW section was well forward, and the tendency was to engage
with the TOW because of its availability and to disregard the
reconnaissance function.

M106 Armored Mortar Carrier with 4.2 inch (107mm) mortar. Two
procedures were tested to incorporate the mortar section into ES
exercises. One procedure used the M32 pneumatic training device, which
attaches to the mortar and shoots a plastic round via air pressure.
The adjustment of air pressure determines the distance that the round
travels, and it is set for each shot to represent the propellant charge.
The direction of shot depends upon the sight deflection settings, just
as it does for a real round. A proportional conversion can be used to
calculate, from the plastic round impact location, what would be the
impact location of a real round. The procedure used in this test
simplified the proportional conversion by laying out a rope scale with
proportional distances marked for the range. Range was estimated by a
controller, with the marked rope as a reference, and lateral distance
(deflection) was estimated as distance from the rope to the right or
left. The rope scale was difficult to use on the Ft Bliss terrain
because the sand dunes interfered with placement.
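
The proportional conversion described above can be illustrated with a
short sketch. The report does not give the actual ratio between plastic
round travel and real round travel, so the scale factor below is a
purely hypothetical value, as are the function and parameter names. The
sketch assumes only that range and lateral deflection scale by the same
factor, which follows if the plastic round leaves the tube in the same
direction as a real round would.

```python
def scale_impact(training_range_m, training_deflection_m, scale_factor):
    """Convert a plastic round impact to the equivalent real round impact.

    training_range_m: distance the plastic round traveled along the rope.
    training_deflection_m: lateral offset from the rope (+ right, - left).
    scale_factor: assumed ratio of real distance to training distance.
    Returns (real_range_m, real_deflection_m).
    """
    return (training_range_m * scale_factor,
            training_deflection_m * scale_factor)

# Hypothetical example: 1 m of plastic round travel represents 10 m of
# real round travel (this ratio is an assumption, not from the report).
real_range, real_deflection = scale_impact(150.0, 5.0, 10.0)
print(real_range, real_deflection)
```

Marking the rope at proportional distances, as was done in this test,
amounts to doing the range multiplication in advance so the controller
only has to read off a value.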

In the alternate mortar procedure, a mortar controller observed the
fire direction and gunnery procedures. When he detected errors, he
computed an impact point and notified the fire marker to deliver the
simulated rounds to the corrected location, rather than the location
requested by the observer. These procedures were too complex for one
controller (observe two sets of crew members, compute impact points,
operate the radio, and record the procedures). Future tests will
examine assignment of two controllers, and simplified procedures.

The  mortars,  which  remained  in  the  rear  of  the  armored  cavalry 
platoons,  were  the  vehicles  least  often  engaged  (Table  4).  However,  two 
mortars  were  hit  by  indirect  fire  from  the  opposing  force.  The  first 
was  a  preplanned  target,  which  the  mortar  had  selected  as  its  initial 
location.  The  training  value  regarding  position  selection  was  evident 
after  this  hit,  since  the  mortar  crew  had  selected  a  position  that  was 
a  major  terrain  feature  (trail  junction)  that  was  also  a  good  point  for 
an  opposing  force  preplanned  target.  The  crews  discussed  this  issue, 
and  quickly  learned  to  select  less  obvious  positions. 

Fire Marker Transportation. Fire markers, who deliver the
artillery burst simulators to the requested impact locations, usually travel
in jeeps. In this test, an OH-58 helicopter was tried as the fire
marker vehicle. Only one helicopter was employed (for safety over the
small exercise lane); therefore, only one of the opposing forces could
have indirect fire simulation at one time. The helicopter had to leave
the  training  area  to  refuel  prior  to  the  end  of  the  exercises,  ter¬ 
minating  indirect  fire  support.  The  helicopter  was  on  station  approxi¬ 
mately  WX  of  the  exercise  time. 

The indirect fire simulation system produced eleven (11) simulated
vehicle engagements, for 21% of the total hits. Note that mortar hits
do not destroy vehicles, but knock out communications and kill exposed
personnel. Therefore, the 11 simulated vehicle engagements did not
destroy the vehicles. In one case, a vehicle that had been hit by
simulated mortar fire early in the exercise was destroyed by .50 caliber
machinegun fire later in the exercise. Previous indirect fire
simulation has shown a higher proportion of hits. For example, during the
REALTRAIN validation in Europe indirect fire accounted for 31% to 32% of
the personnel and vehicle casualties. Various characteristics of the
indirect fire simulation in these exercises at Ft Bliss appeared to
reduce the mortar effectiveness. The problems in use of the helicopter
were just described, and solutions will be tried in the next test.
Other reasons for different indirect fire simulation effects include
terrain, unit composition, and type of indirect fire simulated.

SUBJECTIVE  TRAINING  VALUE  RESULTS 

Participants  and  controllers  were  asked  for  their  subjective  eval¬ 
uations  of  the  training  value  of  the  ES  exercises,  and  how  the  ES  exer¬ 
cises  compared  with  other  training. 

Participants (N=77) responded as follows to the question "How much
would you say you learned during the training exercises you have just
completed?":

    A great deal         44%
    Some                 38%
    Little or nothing    18%

When asked to compare the ES exercises to other training, most
participants replied that the ES exercises were better:

    REALTRAIN much better    36%
    REALTRAIN better         43%
    No difference            11%
    REALTRAIN worse          11%

Compared to the REALTRAIN validation data from Europe, approximately the
same portion of the questionnaire answers are in the combined "better"
and "much better" categories, but in the Europe data, the majority
responded that the ES training was "much more effective". Some
differences in the responses may be due to scaling and administrative
differences. It is possible that some of the training value, or at
least the perception of the training value, was decreased by the
problems that arose in conducting these armored cavalry exercises. Whether
participants would report more perceived training value if the exercises
were run better (e.g., improved target reporting and confirmation)
remains to be tested in future exercises. It should be emphasized that
the armored cavalry exercises entailed development of a new system, in
contrast to the Europe validation of smoothly conducted training.

Controllers (N=48) were asked how much they learned about tactics
when they served as controllers. Responses show that they perceive that
they are learning, often as much as or more than if they were part of the
tactical team:

    I certainly learned as much or more        54%
    as a controller, as I would have if
    I'd been part of the tactical team.

    I learned a fair amount about tactics      33%
    while acting as a controller.

    I didn't learn very much about             13%
    tactics when I was controlling.

Compared to the REALTRAIN validation in Europe, where 70% of the
controllers reported that training value was much greater for controllers
than for participants, the controllers were less positive concerning the
training value of these exercises. These responses may reflect the
exercise problems described above.

SUMMARY 

This phase of testing was designed not to produce final answers but
rather to explore and refine specific ES procedures for use by armored
cavalry elements. While not emphasizing training effectiveness data at
this point in the developmental sequence, perceptions of training value
were collected from participants and controller personnel. Further, one
could trace changes in tactical behavior over the series of exercises
which would indicate that some learning had occurred. However, these
measures do not represent the type of training effectiveness evaluation
that would be conducted in a validation study. Performance measures
appropriate for training effectiveness analysis will be tried in the
next field test, but an objective training effectiveness analysis must
wait for the validation.

These initial tests succeeded in determining several modifications
necessary for the controller optics, signature simulators, and mortar
controller procedures. The controller duties pertaining to casualty
assessment appeared to be satisfactory. Given the modifications
indicated, the casualty related aspects are ready to be written into the
training program for armored cavalry ES.

All of the exercises in these tests contained a large number of
simulated engagements. Thus, they were similar to typical ES exercises
in that respect. However, a special emphasis must be placed on
reconnaissance functions when dealing with armored cavalry. While the
procedures for incorporating reconnaissance activities that were
examined in the initial tests were a step in the right direction,
additional development is required to fully play the reconnaissance
functions. An approach containing several interrelated techniques is
planned for the next field test. First, the exercise scenarios and
operations orders will be designed to limit engagements and to foster
reconnaissance behaviors. When the simulated engagements are limited,
controllers can concentrate on observing and recording the information
gathering and reporting activities of the elements that they control.

The controller records, combined with records that appeared to be
effective in the initial tests, are expected to increase objectivity.
Without such records, the subjective and often conflicting judgments of
the opposing forces constitute the only basis for discussion.
Increasing the objectivity, or records of "ground truth," is expected to
enhance credibility, and in turn increase the troop motivation and
training value. Continued development of armored cavalry ES will focus
on building the strengths of typical, casualty-producing ES into
reconnaissance ES exercises. This revolves around realistic combat
scenarios involving motivated opposing forces in an environment with
strong psychological fidelity. Troops trained with ES may not have been
in combat, but they have had the opportunity to learn the lessons of
combat without having to learn the hard way.


KNOWLEDGE TESTS OF MANUAL TASK PROCEDURES¹


The high cost of hands-on performance testing tends to complicate
life for the developer of job proficiency tests. He is urged by
reasons of economy to develop tests that are administratively feasible.
This usually means tests that can be administered on a group basis,
an interpretation that invariably leads to paper-and-pencil knowledge
testing.

We know that knowledge tests are appropriate for tasks that are
essentially mental, and we know they are inappropriate for tasks that
involve finely tuned motor skill. But what of job tasks in between,
tasks that involve both manual and mental activity? Many job tasks
appear to be predominantly manual, but not particularly skilled.
Placing some machines in operation, assembling objects, and installing or
repairing components represent tasks that are essentially manual,
but which, if performed without rigid time limits, cannot be considered
psychomotor skills. This is not to say that such tasks require no
skill. They must be learned, and if one identifies the skilledness of
a task generally in terms of the amount of practice required to become
proficient, then the aforementioned tasks are to some degree skilled.
But the skilled aspect is probably mental, since knowledge must be
acquired of what steps to perform, in what order and with what result.
It may be hypothesized, in fact, that such manual task procedures can
be performed with little or no practice, if one knows what, when and
how to perform them.

If there is something to this hypothesis, proficiency can be
measured validly in a knowledge testing mode, given one additional
assumption: that the test medium is relatively neutral with respect
to examinee differences in mental ability. This second assumption is
necessary because we are considering a medium for testing that has no
relevance to the medium for task performance. In other words, we would
expect someone who can perform a task to be able to pass a hands-on test
of that task; but if that person can't read or write at all well, we
would be dubious of their ability to read and interpret written questions
about task performance. It seems important, therefore, when substituting
for a hands-on test, that the substitute medium not favor one type of
examinee over another. We should strive to use test media that are
neutral with respect to task-irrelevant differences in abilities.

With this perspective, I would like to describe an experiment in
which we evaluated the validity of knowledge tests as substitutes for
hands-on tests of manual task procedures.

The experiment was designed to examine four methods of knowledge
testing in terms of their relative and absolute correlation with hands-on
task proficiency for high and low mental ability subjects (Ss). The
specific research questions of interest were:

• Do the four types of knowledge test correlate with
hands-on task mastery?

• Do the types of test differ with respect to how
well they distinguish masters from nonmasters?

• Do the types of test distinguish masters from
nonmasters equally well for high and low mental
ability levels?

• Do the types of test tend to produce the same
kinds of errors in predicting task mastery?


¹ This paper is based on research done under Contract No. DAHC 19-?*-C-0059
with the U.S. Army Research Institute for the Behavioral and
Social Sciences. Conclusions and opinions expressed are the authors',
and not necessarily those of the U.S. Army.


Method 

Test Development. Tests were developed for three Army tasks:
Installation of the Field Telephone (TEL), Setting up a Mechanical
Ambush with the Claymore (AMB), and Disassembling the M-16 Rifle (RIF).
The first two are clearly low-skilled tasks. Rifle disassembly,
however, would be classified more accurately as moderately skilled,
since some of the steps entail manipulations that are not easily mastered
in one or two trials. Each task was analyzed into steps on
which the test items were based. In addition to a performance (hands-on)
test, four versions of a knowledge test were developed for each
task. One version was a conventional multiple-choice test. The other
three employed pictures in an effort to minimize literacy demands, but
used different methods of eliciting task knowledge. A description of
the four tests follows.

• Written Choice (WC). This is a standard multiple-choice
test consisting of one question for each
step in the task. A question focused on recognition
of how a step is performed, when it is performed,
or what its correct outcome is. Alternative
answers to a question were limited to
realistic options; unrealistic distractors were
avoided. The test was scored by giving one point
for each correct answer; seven was the maximum
possible score for the TEL and AMB tasks, and
eight the maximum for RIF.

• Picture Choice (PC). This method included the same
questions as the Written Choice, but photographs were
used in place of the printed word in presenting answer
alternatives. The possible points and scoring procedure
were the same as for WC.

• Picture Outcome (PO). In this method a photograph
of the result of an improperly performed task was
presented. Ss were instructed to inspect the picture
and circle any errors. This type of test
focuses on recognition of correct task outcome
only. Test score was based on one point for each
error circled, minus one point for each non-error
circled. Total score was not allowed to go below
zero. The possible range of scores was from 0 to
4 for TEL and RIF, and 0 to 3 for AMB.

• Picture Sort (PS). Photographs of steps in task
performance, including both correctly and incorrectly
executed steps, were used in this test
method. The pictures were scrambled and presented
to S with instructions to select the correct steps
and place them in the order they should be performed.
This method was considered to be the most comprehensive
in its coverage of task knowledge: what
steps to perform, and how and when to perform them,
are required knowledge. The method relies on recognition,
as do the others, but all task elements are
tapped and the guessing factor is minimized. Scoring
was based on the award of one point for each picture
or group of pictures representing a correct step performed
in proper sequence. If two correct steps were
in improper order, credit was withheld for the first
step. Steps were judged to be improperly sequenced
only if it were impossible or hazardous to perform
them in that order. Maximum possible score was seven
for TEL, and eight for AMB and RIF.
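
The Picture Outcome scoring rule described above (one point per error circled, minus one point per non-error circled, with a floor of zero) can be sketched in a few lines. This is only an illustration: the function name and the region labels are hypothetical and do not come from the study materials.

```python
def picture_outcome_score(circled, true_errors):
    """Score one Picture Outcome response.

    circled:     set of picture regions the examinee circled (hypothetical labels)
    true_errors: set of regions that really are errors in the photograph
    """
    hits = len(circled & true_errors)          # one point per error circled
    false_alarms = len(circled - true_errors)  # minus one point per non-error circled
    return max(0, hits - false_alarms)         # total score may not go below zero


# Hypothetical photograph containing four errors (maximum score 4).
errors = {"wire_a", "wire_b", "battery", "handset"}

print(picture_outcome_score({"wire_a", "battery"}, errors))
print(picture_outcome_score({"wire_a", "strap", "case"}, errors))
```

The second call shows why the floor matters: one hit and two false alarms would otherwise give a negative score.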

Subjects. Thirty-seven soldiers from units at Fort Knox were tested.
They were chiefly from combat arms MOSs and ranged in grade from E-2 to
E-6. For the purpose of study design, Ss were in two mental ability (MA)
groups: GT over 110 (high MA), and GT under 90 (low MA).² Twenty Ss
were in the high MA group and 17 were in the low.

Procedure. On arrival at the test site the project was explained
briefly to Ss. What was said to them took the following general form:

We are working on a project to evaluate several different
methods of testing. You will take a hands-on test
for three tasks. Then you will take four other kinds
of tests for each task. After the test we will ask
your opinion of it. This is not an MOS test, so there
is no reason for you to be nervous. But the project
is very important so, of course, we expect you to do
as well as you can on every test.


² The GT (General-Technical) is a combination of scores on a verbal
and a quantitative aptitude test. It is considered to be the best
indicator of general mental ability in the Army Classification Test
Battery.

All testing was done individually and began with administration of
the hands-on test. At this point some Ss received training on the task
before going on to the knowledge tests. This was done to control the
range of task mastery within the two MA groups. The intention was to
create a rectangular distribution of mastery, with approximately a
third of each MA group being wholly unqualified on a task, a third
being partially qualified, and a third full masters. This approach
worked well at the full mastery level since only one S could perform a
task (TEL) without further training. Thus, seven masters were created in
each MA group by training them to pass the three hands-on tests. The
approach did not work as well within the nonmastery range since most
Ss could perform some steps in the TEL and RIF tasks; only with the
AMB task were any Ss trained to partial mastery.

Once an S had completed the hands-on test for a task, he was given
the four knowledge tests successively. The order of test administration
was counterbalanced over Ss.

In addition to taking the tests, Ss gave their opinions of
the methods by ranking them from 1 to 5 with respect to the
question: "Do you think this test is a good way to find out if a
soldier can (task statement)?"

Scores on the 15 tests (one hands-on and four knowledge tests for
each of three tasks) and Ss' ratings comprised the data that were
analyzed.

Continuous score correlations between knowledge test and hands-on
performance for the three tasks are shown in Table 1 for the two levels
of mental ability and for the total sample. With few exceptions the
correlations are both statistically and practically significant. They
are uniformly higher, regardless of test method, for the TEL and AMB
tasks than for RIF, indicating that rifle disassembly is somehow different
from the other tasks; a difference attributable perhaps to a more
skilled motor component.
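
As a minimal sketch of the continuous-score correlation referred to here, the product-moment coefficient between a knowledge test and the hands-on criterion can be computed as follows. The scores are invented for illustration and are not the study data; the source does not state which correlation coefficient was used, so the Pearson formula is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six examinees on one seven-step task.
hands_on = [7, 7, 5, 4, 2, 1]        # steps performed correctly
written_choice = [7, 6, 5, 3, 3, 1]  # multiple-choice items answered correctly

print(round(pearson_r(hands_on, written_choice), 2))
```

With a range of only seven points per task, a single examinee can move the coefficient noticeably, which is one reason the within-group values in a sample of this size should be read cautiously.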

Comparison by type of knowledge test, for the total sample and
total performance on the three tasks, indicates that the Written
Choice, Picture Choice, and Picture Outcome correlate equally well
(.83, .80, and .84 respectively) with hands-on performance. The
Picture Sort method yields a somewhat smaller overall relationship (.58),
although the reduction is attributable to the near-zero correlation for
the RIF task. The trend toward higher correlations for total score than
for task scores reflects a tendency for intercorrelations among tasks to
be lower for a knowledge test than for the hands-on criterion.³

³ The reader will recall that, by design, the same people were masters on
all tasks (had maximum criterion scores) although nonmasters varied in
degree of nonmastery from task to task.


TABLE 1

[Table body not recovered. Surviving column heading: Total Performance on the Three Tasks]


Further analyses of the effectiveness of the different knowledge
tests in distinguishing masters from nonmasters, both within and between
levels of mental ability, were carried out by analysis of variance.
This is a reasonable way to examine the data, since mastery level was
more of a manipulated "treatment" effect than a natural variate.
Knowledge test performance, summed over tasks, of masters and nonmasters
by mental ability level is shown in Table 2. The test methods
did not all have the same scale of measurement, so a separate ANOV
(Winer, 1962) was performed on each method. Results of the four
unweighted means ANOVs are summarized in Table 3 and shown graphically
in Figure 1. A clear and substantial main effect is revealed for mastery
level, which merely represents the high correlations between knowledge
test and task performance already mentioned. The size of this main effect
for Picture Outcome relative to other test methods is worthy of note. The
graphs in Figure 1 indicate that masters tend to average about five
points higher than nonmasters on all tests, even though the potential
range of performance on PO is only half that of the other tests. This
would imply that a longer test would produce greater improvement in
discrimination between masters and nonmasters for PO than for the
other methods.
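
The F ratios reported in Table 3 can be checked directly, since each is the effect mean square divided by the error mean square for that test method. The short sketch below uses the mean squares as printed in Table 3; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Mean squares from Table 3: (mastery M, mental ability A, M x A, error).
table3_ms = {
    "Written Choice":  (289.853, 22.691, 5.126, 6.287),
    "Picture Choice":  (260.120, 12.357, 0.364, 5.948),
    "Picture Outcome": (177.034, 5.060, 3.271, 1.856),
    "Picture Sort":    (180.178, 39.781, 8.733, 14.150),
}


def f_ratios(ms_mastery, ms_ability, ms_interaction, ms_error):
    """Return the three F ratios (M, A, M x A), each rounded to two decimals."""
    effects = (ms_mastery, ms_ability, ms_interaction)
    return tuple(round(ms / ms_error, 2) for ms in effects)


for method, mean_squares in table3_ms.items():
    print(method, f_ratios(*mean_squares))
```

The mastery F for Picture Outcome is roughly double that of the other methods because its error mean square is so much smaller, which is the point made in the text about the size of that main effect.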

Performance on the knowledge tests tended to be lower for low
mental ability Ss than for high, as indicated by the slope of the
curves in Figure 1. The difference is small, and in fact not statistically
reliable according to the separate ANOVs. However, when
performance was converted to standard scores within test method and
aggregated over methods, the mental ability factor is marginally
significant (p > .05). Moreover, the difference appears to be relatively
constant over test methods (Figure 2), suggesting that no
one method is superior in neutralizing mental ability differences.
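
Converting scores to standard scores within each test method puts methods with different scales on a common footing before they are aggregated. A minimal sketch, with invented scores and under the assumption that the population standard deviation is used (the paper does not say which form was applied):

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to standard (z) scores within one test method."""
    m = mean(scores)
    s = pstdev(scores)  # assumption: population SD; the source does not specify
    return [(x - m) / s for x in scores]


# Hypothetical summed scores for the same five examinees on two methods
# whose scales differ (for example 22 possible points versus 11).
raw = {
    "WC": [18, 14, 20, 11, 16],
    "PO": [10, 6, 9, 4, 7],
}

z = {method: standardize(scores) for method, scores in raw.items()}

# Aggregate over methods: one composite standard score per examinee.
composite = [mean(per_examinee) for per_examinee in zip(*z.values())]
print([round(c, 2) for c in composite])
```

After standardizing, every method has mean 0 and standard deviation 1, so a method with a wider raw range no longer dominates the aggregate.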

One of the more interesting features of the data (Figure 1) is
the trend, however slight, toward a larger difference between masters
and nonmasters in the low MA group. This indicates a slightly higher
correlation between knowledge test and task performance for low mental
ability Ss, a tendency also observed in Table 1 where for 9 of the 12
method/task combinations the correlation with mastery was higher within
the low mental ability group. Note that this is not a statistically
reliable phenomenon, but it suggests an interesting hypothesis:
knowledge-based tests predict task performance better among people of
moderate to low mental ability than among those of high mental ability.

Validity in a strict correlational sense does not tell the whole
story, however. The type of prediction or classification error is of
practical interest. By converting knowledge test performance to pass-fail
scores and arraying them against the master-nonmaster criterion,
four-fold tables were generated from which the incidence of false-negative
(masters who failed the test) and false-positive (nonmasters
who passed the test) classification errors was determined. The
correlation and amount of classification error, of course, depend on
the standard used in scoring pass-fail. Classification error was
tabulated for a standard of full mastery on the knowledge test (pass =
all items right) and again for a standard of part mastery




TABLE 2

KNOWLEDGE TEST PERFORMANCE (MEANS AND STANDARD DEVIATIONS)
OF MASTERS AND NONMASTERS BY TEST METHOD AND MENTAL ABILITY LEVEL

                                           TEST METHOD
MASTERY       MENTAL              WRITTEN   PICTURE   PICTURE   PICTURE
LEVEL         ABILITY             CHOICE    CHOICE    OUTCOME   SORT

MASTERS       HIGH       Mean      18.71     20.28     10.29     18.71
                         SD         2.98      1.60       .76      3.25
                         N             7         7         7         7

              LOW        Mean      17.86     19.29     10.14     17.57
                         SD         1.95      2.10      1.21      3.41
                         N             7         7         7         7

NONMASTERS    HIGH       Mean      13.69     15.00      6.38     15.15
                         SD         2.56      2.80      1.61      4.56
                         N            13        13        13        13

              LOW        Mean      11.30     13.60      5.00     12.00
                         SD         1.83      2.59      1.41      3.06
                         N            10        10        10        10


TABLE 3

ANOV SUMMARIES OF THE EFFECTS OF TASK MASTERY (M)
AND MENTAL ABILITY (A) ON KNOWLEDGE TEST PERFORMANCE

TEST METHOD        SOURCE          SS     df         MS          F

WRITTEN CHOICE     M          289.853      1    289.853      46.10**
                   A           22.691      1     22.691       3.61
                   M x A        5.126      1      5.126        .82
                   Error      207.466     33      6.287

PICTURE CHOICE     M          260.120      1    260.120      43.73**
                   A           12.357      1     12.357       2.08
                   M x A         .364      1       .364        .06
                   Error      196.273     33      5.948

PICTURE OUTCOME    M          177.034      1    177.034      95.38**
                   A            5.060      1      5.060       2.73
                   M x A        3.271      1      3.271       1.76
                   Error       61.248     33      1.856

PICTURE SORT       M          180.178      1    180.178      12.73**
                   A           39.781      1     39.781       2.81
                   M x A        8.733      1      8.733        .62
                   Error      466.939     33     14.150

** p < .01




FIGURE 2. Mean standard score performance of high and low
mental ability (MA) groups for the four knowledge tests.


(pass = no more than one item wrong). The results are shown in Table 4
for high and low mental ability groups and for the total sample. With the
exception of the Picture Outcome method, classification error is somewhat
less using the more liberal part mastery criterion on the knowledge
tests. Total error tended to run about 25% on the average, reaching a
low of 16% for the Picture Choice method with the criterion of part
mastery. Of particular interest is the distribution of total error
between false-positive and false-negative categories. As the standard
for passing a predictor measure is relaxed, the number of false-positives
generally increases. The optimal ratio of the two types of error is a
moot point, and will depend largely on how test scores are to be used.
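
The four-fold tabulation behind these percentages can be sketched as follows. The scores and mastery flags are invented, and the function name is hypothetical; errors are expressed as a percentage of all examinees, which is an assumption consistent with the false-negative and false-positive columns of Table 4 summing to the total.

```python
def classification_error(test_scores, is_master, cutoff):
    """Percent false negatives, false positives, and total error for a pass cutoff.

    A false negative is a master who scores below the cutoff; a false positive
    is a nonmaster who scores at or above it.
    """
    n = len(test_scores)
    pairs = list(zip(test_scores, is_master))
    fn = sum(1 for score, master in pairs if master and score < cutoff)
    fp = sum(1 for score, master in pairs if not master and score >= cutoff)
    return {"FN": 100 * fn / n, "FP": 100 * fp / n, "TOT": 100 * (fn + fp) / n}


# Hypothetical seven-item knowledge test taken by five masters and five nonmasters.
scores = [7, 7, 6, 6, 5, 7, 6, 4, 3, 5]
masters = [True] * 5 + [False] * 5

full = classification_error(scores, masters, cutoff=7)  # pass = all items right
part = classification_error(scores, masters, cutoff=6)  # pass = at most one wrong
print(full)
print(part)
```

In this toy data, relaxing the standard from full to part mastery lowers the false-negative rate and raises the false-positive rate, the trade-off described in the text.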

But if test fairness is the goal, then minimizing the number of
false-negatives should be the objective. The relative number of false-negatives,
moreover, should be the same for groups differing in mental ability
(or any other ability correlated with test score but unrelated to criterion
performance). Comparing high and low MA groups we find a small
but consistent tendency toward more false-positives among the high MAs,
and more false-negatives among the low. This trend was evaluated by
chi-square analysis of the difference in type of classification error
between high and low MA groups, and is shown in Table 5 by test method
for each standard of test "mastery." Observed chi-squares were tested
at the 10% level of significance, which provides for a conservative
decision with respect to accepting the null hypothesis of no difference
between groups in distribution of classification error. Type of classification
error produced by the knowledge tests does appear to interact
with mental ability. Although the number of cases underlying the
analysis is too small to warrant firm conclusions, indications are that
if one were interested in minimizing the incidence of false-negatives
(i.e., the part mastery standard), the Picture Choice method produces
the most equitable results for both mental ability groups.
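
A chi-square test of this kind compares a 2 x 2 table of error type (false negative versus false positive) by mental ability group. The sketch below uses the uncorrected Pearson formula and invented counts; whether the original analysis applied a continuity correction is not stated, so that choice is an assumption. The critical value 2.706 is the standard chi-square value for one degree of freedom at the .10 level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Every row and column total must be nonzero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


CRITICAL_10 = 2.706  # chi-square critical value, df = 1, alpha = .10

# Hypothetical error counts:   false neg.  false pos.
high_ma = (3, 9)
low_ma = (8, 2)

chi2 = chi_square_2x2(*high_ma, *low_ma)
print(round(chi2, 2), chi2 > CRITICAL_10)
```

A value above the critical value would indicate that the two groups differ in the kind of error the test makes, which is what the starred entries in Table 5 report.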

Personal Preferences for Test Methods. Ss' opinions of the test
methods were solicited after each test was administered and again when
all testing was concluded. Responses at the two points in time were
similar, so only the final ratings are reported here. Ss were asked
to rank the five methods (including the hands-on criterion test) from
highest to lowest in terms of the question, "Do you think this test is
a good way to find out if a soldier can ... (e.g., set up a mechanical
ambush with a Claymore)?" Rankings were done separately for each task.
Overall mean preference was highest for the hands-on method of performance
testing, as might be expected (Tables 6 and 7). Differences
in preference for the four methods of knowledge testing were less
pronounced, although the Picture Choice consistently received higher
average ranking regardless of the referent task or rating subgroup.

Overall, the hands-on method was first, Picture Choice second, Picture
Sort third, Picture Outcome fourth, and Written Choice last in average
order of preference.


TABLE 4

AVERAGE(a) PERCENT CLASSIFICATION ERROR AS A FUNCTION
OF KNOWLEDGE TEST METHOD AND LEVEL OF MENTAL ABILITY

                                   KNOWLEDGE TEST METHOD
                                  CLASSIFICATION ERROR(b)
TEST       MENTAL
STANDARD   ABILITY         WC              PC              PO              PS
           GROUP      FN  FP  TOT     FN  FP  TOT     FN  FP  TOT     FN  FP  TOT

FULL       HIGH       18  05   23     11  07   20     05  15   20     23  03   26
MASTERY    LOW        27  02   29     25  04   29     16  08   24     33  00   33
           TOTAL      22  04   26     19  05   24     10  12   22     28  02   30

PART       HIGH       07  13   20     02  13   15     00  32   32     15  17   32
MASTERY    LOW        18  02   20     08  10   18     04  20   24     22  02   24
           TOTAL      12  08   20     04  12   16     02  26   28     18  10   28

(a) Averaged over the three tasks.

(b) FN = False Negatives (masters who failed knowledge test)
    FP = False Positives (nonmasters who passed knowledge test)
    TOT = Total Classification Error


TABLE 5

CHI SQUARE OF THE DIFFERENCE IN
TYPE OF CLASSIFICATION ERROR BETWEEN
HIGH AND LOW MENTAL ABILITY GROUPS
BY TEST STANDARD AND TEST METHOD

TEST                  KNOWLEDGE TEST METHOD
STANDARD           WC       PC       PO       PS

FULL MASTERY      1.33     1.54     4.19*    2.26
PART MASTERY      7.22*    2.49     3.38*    6.30*

*p < .10


TABLE 6

MEAN ORDER OF PREFERENCE(a) BY TASK FOR
THE HANDS-ON AND KNOWLEDGE TEST METHODS

                           TEST METHOD
TASK      HANDS-ON      WC       PC       PO       PS

TEL         1.14       3.92     3.03     3.58     3.33
AMB         1.25       3.78     2.94     3.72     3.31
RIF         1.08       3.67     3.14     3.47     3.64

(a) The lower the number the higher the preference.
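
The overall order of preference can be illustrated from the task means in Table 6. The sketch below takes a simple unweighted average of the three task means for each method and sorts by it; this is an illustrative calculation only, and the averages it produces need not equal the Total row reported in Table 7.

```python
# Mean rank by task, from Table 6 (lower number = higher preference).
table6 = {
    "TEL": {"Hands-on": 1.14, "WC": 3.92, "PC": 3.03, "PO": 3.58, "PS": 3.33},
    "AMB": {"Hands-on": 1.25, "WC": 3.78, "PC": 2.94, "PO": 3.72, "PS": 3.31},
    "RIF": {"Hands-on": 1.08, "WC": 3.67, "PC": 3.14, "PO": 3.47, "PS": 3.64},
}

methods = ["Hands-on", "WC", "PC", "PO", "PS"]

# Unweighted average of the three task means for each method.
average_rank = {
    m: sum(task[m] for task in table6.values()) / len(table6) for m in methods
}

# Most preferred method first.
preference_order = sorted(methods, key=average_rank.get)
print(preference_order)
```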


TABLE 7

MEAN ORDER OF PREFERENCE BY SUBGROUP
FOR THE HANDS-ON AND KNOWLEDGE TEST METHODS

                                TEST METHOD
SUBGROUP       HANDS-ON      WC       PC       PO       PS

MASTERS          1.00       3.85     3.08     3.38     3.69
NON-MASTERS      1.30       3.83     2.65     3.87     3.35
HIGH MA          1.35       4.10     2.65     3.70     3.20
LOW MA           1.06       3.50     3.00     3.69     3.81
TOTAL            1.19       3.83     2.81     3.69     3.47


Discussion


A number of interesting though tentative findings emerged from
this study. The small sample of people and tasks certainly limits
generality of the results, and the following interpretation and conclusions
should be so tempered.

The data strongly support the hypothesis that performance on
manual task procedures is mediated by knowledge. Correlations between
task knowledge and task performance were high, particularly for the
two procedural tasks with the lowest skill requirements. The correlations
reached as high as .75 in spite of the fact that the range of
possible test performance seldom exceeded seven points. When performance
was aggregated over tasks, the correlations tended to be more
on the order of .80.

Substantial differences among methods of knowledge testing were
not found. The conventional written multiple-choice test did essentially
as well as the pictorially based methods in distinguishing
masters from nonmasters. (In this connection, however, it should be
noted that test questions were carefully directed at steps necessary
in task performance, and did not include those marginally relevant
knowledge items often found on such tests.) Failure of the Picture
Sort tests to correlate higher with performance was an unexpected
result. This method was designed to tap more fully all knowledge
aspects of task performance, including recognition of the steps,
their correct outcome, and sequence. In so doing, however, it may
well have become the most demanding test technique from the standpoint
of method-specific mediation requirements; that is, the examinee
must first analyze what he does in performing the task, and then
synthesize it a step at a time by sorting through a large number of
pictures more or less representative of his mental images of the task.
That kind of abstract manipulation probably taxes the intellectual
and visualization abilities more than we originally anticipated. In
support of this speculation, there was some indication that Ss in the
low mental ability group had more trouble with this test method than
with others (Figure 1). The written and pictorial multiple-choice
tests, though more dependent on literacy, represent a culturally
familiar method. The Picture Outcome method appears to be the simplest
in the sense of minimizing both literacy and method-specific mediational
demands, and is certainly worthy of further study and development as an
efficient method of knowledge testing.

Correlations between knowledge and performance were not significantly
different for high versus low mental ability Ss. Yet there was a slight
but noticeable trend toward larger correlations within the low mental
ability group. The possibility that knowledge measures, including the
standard multiple-choice test, are better predictors of task mastery
for those of below average mental ability is intriguing. If true, we
need to reevaluate the popular notion that knowledge tests of manual
performance are unfair to those less apt in the academic skills of
reading, writing, and symbol manipulation. The notion is probably valid,
but it may be so for reasons quite different than normally offered.
Knowledge tests apparently are good predictors of performance on
low-skill procedural tasks among people of low to moderate mental
ability. The unfairness lies not in the inability of this group to
use a knowledge testing medium, but in the tendency of brighter
people to overuse it. The hypothesis here is that some minimum
level of ability, whether innate or acquired, is necessary to handle
the symbolic and semantic demands of a knowledge test; but beyond
that level, correlated factors such as test-wiseness begin to moderate
the true relationship between task knowledge and performance.
Two additional features of the data tend to support this speculation:
a) higher average knowledge test scores for the high mental ability
group, and b) relatively more false-positive errors in predicting
mastery among this group.

If one were urged to recommend, on the basis of this study, a
method of testing knowledge on low-skill procedural tasks, the Picture
Choice would probably have to be named. The data are certainly not
conclusive, but this method came the closest to meeting the overall
validity criteria: it demonstrated a high correlation with hands-on
task performance; the correlation was relatively constant over the
range of mental ability; and the distributions of classification
error were more nearly proportional for the two levels of mental
ability. Moreover, the Picture Choice method was second only to the
hands-on test in examinee preference.


Differential Prediction from an Unexpected Source

Andrew N. Dow, Ed.D.
NETPDC, Ellyson, Pensacola, Florida


During the course of a validity study that is reported elsewhere
(Dow, 1977), some previously unsuspected patterns seemed to
emerge from the data. If the validation study had been run entirely
by computer, and the results picked up from printouts, there is a
good chance that these patterns would have remained undetected. However,
circumstances were such that it was more convenient to use a
small programmable calculator, the Monroe 326, than to arrange programming
and runs on the computer. While the Monroe 326 calculates
like a computer from a stored program, each bit of data is entered
by hand, and each result is read from a visual display and recorded
by hand. It was while hand-recording results that the author noticed
the emerging patterns.

The test that was the object of the validity study was the
communication section of the U.S. Navy's advancement examination
administered to candidates for advancement to E-8 and E-9 rates.
This section, or rather, these sections, as there is a different one
for each paygrade and advancement cycle, are based upon a technique
and structure devised by Haney (1953, 1955, and 1958) and called the
Uncritical Inference Test.

Haney's Uncritical Inference Tests consist of a short story
of 40 to 200 words followed by a series of true-false-? test questions.
The directions tell the person taking the test to read the
brief story and accept it as true and accurate, then, if necessary,
re-read the story, and respond to the true-false-? items in order.
An answer of "T" means that, on the basis of the story, the statement
is DEFINITELY TRUE; an answer of "F" means that, on the same basis,
the statement is DEFINITELY FALSE; an answer of "?" means that, on
the basis of the story, you cannot be definitely certain about the
answer.

During the development of the test, Haney checked both the
reliability and uniqueness of the trait that he was presuming to
measure. For forms A and B respectively, he found split-half correlations
of .762 and .818; when corrected by the Spearman-Brown
technique, these increased to .928 and .947. The correlations from
the test-retest method were slightly lower, running .67 when form A
was followed by form B, and .56 when form B was followed by form A.
Correlations with reading comprehension and general ability tests
ranged from .20 to .33. From these figures, it is possible to conclude
that the Uncritical Inference Test was measuring a definite,
unique, independent trait.

As mentioned in previous works of this author (Dow, 1977;
Macaluso and Dow, 1969), a communication section was incorporated in
each exam for advancement to E-8 or E-9. These communication sections
were patterned after the Uncritical Inference Test, and always contained
exactly 20 of the True-False-? items. On a purely face-validity
basis, a test of this type seems to measure an aspect of being
a military supervisor. However, this type of validity is not easily
verified; the mathematics is simple, but the criterion is elusive,
to say the least. Therefore, the author attempted to validate the
communication section against the decisions of the several boards
that select the candidates who are to be advanced. It was while he
was working on the data for the validation study that the author
noted that some ratings tended to have higher scores than others, and
that the trends seemed to be rather consistent.

As a first step, a series of correlations was run, using as raw data
each rating's mean communication raw score, computed separately for
selectees and nonselectees in a paygrade and series. For example,
one correlation was between the mean scores of series 65, E-8 nonselectees
(65-8-NON) and those of series 68, E-8 selectees (68-8-SEL).
Another correlation was between series 68 E-8 selectees (68-8-SEL)
and series 68 E-9 nonselectees (68-9-NON). These various correlations
are listed in Tables 1, 2, and 3.

When Tables 1, 2, and 3 are reviewed, it is noted that the coefficients
of correlation range from a high of .972 to a low of .527.
Table 4 is a frequency distribution of the coefficients found in
Tables 1, 2, and 3; note that only one coefficient is smaller
than .650, and that the median value is .812. The calculated arithmetic
mean is .808; rounded to two figures, the mean and median agree
at .81. Further note that five of the coefficients are greater
than .90; a total of 23 of them are larger than .75.
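
The summary figures quoted above can be reproduced from the 29 coefficients listed in Tables 1, 2, and 3. The short sketch below simply pools the tabled values and recomputes the median, mean, and counts; the variable names are illustrative.

```python
from statistics import mean, median

# Coefficients as listed in Tables 1, 2, and 3.
table1 = [.776, .972, .819, .840, .783, .681, .768, .727, .692,
          .901, .905, .763, .803, .724, .786]
table2 = [.844, .812, .756, .898, .948, .893]
table3 = [.527, .771, .816, .837, .878, .721, .862, .923]

coefficients = table1 + table2 + table3

summary = {
    "n": len(coefficients),
    "high": max(coefficients),
    "low": min(coefficients),
    "median": median(coefficients),
    "mean": round(mean(coefficients), 3),
    "above_.90": sum(r > .90 for r in coefficients),
    "above_.75": sum(r > .75 for r in coefficients),
    "below_.65": sum(r < .65 for r in coefficients),
}
print(summary)
```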

Before discussing the implications of these rather large correlation
coefficients, other facts must be put on record. First,
series 65 is the earliest of the three exam cycles and series 71 the
most recent. Secondly, persons who were not selected (NONs) in a given
cycle may participate in the next cycle, and in later cycles until they
are selected; this means that NONs from a cycle will be included in both
the NONs and SELs of the following cycle (series), at that paygrade.

The actual percent of overlap of personnel is not known, but is
assumed to be rather high, over 50% in some cases. In addition to the
overlap caused by reparticipation of nonselectees, some of those
selected for advancement to E-8 in cycle 65 caused a further overlap
by participating in the cycle 71 E-9 exams. However, because of other
data difficulties, there were no correlations involving the 65-8-SEL
to 71-9-NON or 71-9-SEL group.

TABLE 1

X DATA      Y DATA      r_xy

65-8-NON    68-8-SEL    .776
65-8-NON    68-8-NON    .972
65-8-NON    71-8-SEL    .819
65-8-NON    71-8-NON    .840
65-8-NON    65-8-SEL    .783
65-8-SEL    68-8-SEL    .681
65-8-SEL    68-8-NON    .768
65-8-SEL    71-8-SEL    .727
65-8-SEL    71-8-NON    .692
68-8-NON    71-8-SEL    .901
68-8-NON    71-8-NON    .905
68-8-NON    68-8-SEL    .763
68-8-SEL    71-8-SEL    .803
68-8-SEL    71-8-NON    .724
71-8-SEL    71-8-NON    .786

Correlation Coefficients Between the Scores Achieved by the
Several Specialties in the Various Groups of E-8 Candidates

TABLE 2

X DATA      Y DATA      r_xy

68-9-SEL    71-9-SEL    .844
68-9-SEL    71-9-NON    .812
68-9-SEL    68-9-NON    .756
68-9-NON    71-9-SEL    .898
68-9-NON    71-9-NON    .948
71-9-SEL    71-9-NON    .893

Correlation Coefficients Between the Scores Achieved by the Several
Specialties in the Various Groups of E-9 Candidates from Series A
and B, Only

TABLE 3

X DATA      Y DATA      r_xy

68-8-SEL    68-9-SEL    .527
68-8-SEL    68-9-NON    .771
68-8-SEL    71-9-SEL    .816
68-8-SEL    71-9-NON    .837
68-8-NON    68-9-NON    .878
68-8-NON    68-9-SEL    .721
68-8-NON    71-9-SEL    .862
68-8-NON    71-9-NON    .923

Correlation Coefficients Between the Scores Achieved by the Several
Specialties in the Various Groups, Across Paygrades Using Candidates
in Series A and B, Only

TABLE 4

Correlation Coefficients    N

.950 - .999                 1
.900 - .949                 4
.850 - .899                 4
.800 - .849                 7    (.812 Median)
.750 - .799                 7
.700 - .749                 3
.650 - .699                 2
.600 - .649                 0
.550 - .599                 0
.500 - .549                 1

Frequency Distribution of Correlation
Coefficients Found in Tables 1, 2, and 3

Note that four of the five coefficients greater than .90 are
between successive series groups at the same paygrade, with the early
group a NON; pairs such as these probably have a high overlap of
participants. Even though these are group means rather than individuals'
scores, these large coefficients probably indicate acceptable
test-retest reliability for reasonably equivalent forms of the test. These
tests (or sections) are not equated, and their mean raw scores differ
noticeably; therefore we cannot call them equivalent. However,
because they are of the same structure, and because they do seem to
predict one another, they can be called reasonably equivalent forms.

From a different angle, the eight coefficients that are between
groups with overlap range from .972 down to .776, with a median of .90.
The 21 coefficients that were calculated between non-overlapping
groups range from .923 down to .527, with a median of .78. The
differences between the two kinds of groups indicate that the test will
predict the score on the retest. These rather large coefficients of
correlation show that there is a rating-specific trait that is being
measured by the communication subtest of the E-8 and E-9 advancement
examinations.
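
The split into overlapping and non-overlapping pairs can be reproduced from the tables under one stated assumption. The paper does not list which eight pairs overlap; the Python sketch below assumes that a pair overlaps when both groups are at the same paygrade, come from different series, and the group from the earlier series is a NON. That rule, and the names `PAIRS`, `parse`, and `overlaps`, are illustrative assumptions inferred from the text, not the author's stated procedure.

```python
from statistics import median

# (X group, Y group, r) transcribed from tables 1, 2, and 3.
PAIRS = [
    ("65-8-NON", "68-8-SEL", .776), ("65-8-NON", "68-8-NON", .972),
    ("65-8-NON", "71-8-SEL", .819), ("65-8-NON", "71-8-NON", .840),
    ("65-8-NON", "65-8-SEL", .783), ("65-8-SEL", "68-8-SEL", .681),
    ("65-8-SEL", "68-8-NON", .768), ("65-8-SEL", "71-8-SEL", .727),
    ("65-8-SEL", "71-8-NON", .692), ("68-8-NON", "71-8-SEL", .901),
    ("68-8-NON", "71-8-NON", .905), ("68-8-NON", "68-8-SEL", .763),
    ("68-8-SEL", "71-8-SEL", .803), ("68-8-SEL", "71-8-NON", .724),
    ("71-8-SEL", "71-8-NON", .786),
    ("68-9-SEL", "71-9-SEL", .844), ("68-9-SEL", "71-9-NON", .812),
    ("68-9-SEL", "68-9-NON", .756), ("68-9-NON", "71-9-SEL", .898),
    ("68-9-NON", "71-9-NON", .948), ("71-9-SEL", "71-9-NON", .893),
    ("68-8-SEL", "68-9-SEL", .527), ("68-8-SEL", "68-9-NON", .771),
    ("68-8-SEL", "71-9-SEL", .816), ("68-8-SEL", "71-9-NON", .837),
    ("68-8-NON", "68-9-NON", .878), ("68-8-NON", "68-9-SEL", .721),
    ("68-8-NON", "71-9-SEL", .862), ("68-8-NON", "71-9-NON", .923),
]


def parse(group):
    """Split a label such as '65-8-NON' into (series, paygrade, status)."""
    series, paygrade, status = group.split("-")
    return int(series), int(paygrade), status


def overlaps(x, y):
    """Assumed rule: same paygrade, different series, earlier group a NON."""
    (sx, px, stx), (sy, py, sty) = parse(x), parse(y)
    if px != py or sx == sy:
        return False
    earlier_status = stx if sx < sy else sty
    return earlier_status == "NON"


overlapping = [r for x, y, r in PAIRS if overlaps(x, y)]
non_overlapping = [r for x, y, r in PAIRS if not overlaps(x, y)]

for label, values in (("overlapping", overlapping),
                      ("non-overlapping", non_overlapping)):
    print(label, len(values), max(values), min(values),
          round(median(values), 2))
```

Under this assumed rule the counts, ranges, and medians agree with the figures given in the paragraph above, which suggests, but does not establish, that it is the grouping the author used.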

While no detailed data are presented in this paper, the author
noted that similar ratings had similar means (high, low, or in
between). This opens the possibility that the trait is not exactly
rating-predictive, but is related to an occupational area. The
possible existence of rating groups or clusters should be investigated.

As the original work was done with a small group of ratings, the
study should be repeated for all Navy ratings. Also, the original
study used the records only of those candidates who scored high
enough to be considered by the selection board; all candidates should
be included in this new study.

If further studies do confirm that a reliable measurable trait
does exist, and that it is rating specific or interest-area specific,
then it becomes necessary to find out whether the trait pre-exists in
these persons, or whether it develops during the years that they have
worked in their ratings. Haney (1958) found that experienced policemen
scored no higher on the Uncritical Inference Test than police rookies
did. He had assumed that several years of police experience would
make a person more critical of the inferences he draws from written
material.


If it is found that the trait pre-exists, and is not developed
by specific Navy experiences, then its use as a differential
selection device should be pursued. As a rough guess, a predictive
instrument would necessarily have more than twenty questions; very likely
there should be several short stories, each with 20 to 30 questions.

ARMY SKILL QUALIFICATION TEST
Dr. Clay Brittain


US Army Training Support Center, Fort Eustis, VA

This is the third year running in which we have had a Military Testing
Association (MTA) session devoted to SQT. Two years ago, we appeared
under the organizational rubric of the Army Enlisted Test Activity (or ETA).
Last year, the name was Individual Training Evaluation Group (or ITEG).
Now we are the Individual Training & Evaluation Directorate (or ITED) of the
Army Training Support Center.

One might surmise that name changes as frequent as this reflect
more than the usual bureaucratic proclivity for shifting around boxes
and changing labels on an organizational chart and may be indicative of
an identity problem, i.e., that we are still trying to find out what we
are about. This is not the case. Our goal is essentially the same now
as it was last year and the year before that; namely, to implement an
approach to occupational proficiency testing which goes beyond the
measurement of job-knowledge to the assessment of job-competence. We recognize
that such a goal is not unique to ITED, or to the Army. But the SQT
program may be unique in the degree of commitment and the magnitude of
the effort it represents toward realization of this goal.

A premise which is basic to the SQT program is that testing serves
powerfully to stimulate and discipline individual training. Soldiers
are strongly motivated to acquire and supervisors to train those
competencies which are to be tested. Thus, the SQT program has been designed
not simply to serve personnel management needs but to give leverage for
focusing and enhancing individual training. The strong emphasis in
the SQT program on critical tasks tested realistically derives from this
basic aim of insuring that testing is relevant to effective training. The
present symposium reflects this point of view.

As a prelude to the symposium, we have had on display for the past
two days an exhibit on the SQT program. The display included components
of the Individual Training/Evaluation System in which the SQT is embedded.
It may be useful here to briefly review this system.

The initiation of SQT development presumes a comprehensive job analysis
which identifies the job-tasks critical in an MOS; and for each task a
thorough task analysis. The results of job and task analyses are
incorporated into a Soldier's Manual. The Soldier's Manual defines for soldiers
the MOS. It lists the tasks which have been identified as critical to the
soldier's job and for each task gives a description which includes a
statement of the conditions under which the task is to be performed and of
the standards which define satisfactory performance. Moreover, the
Soldier's Manual delineates the domain of the SQT; i.e., the SQT samples
those job tasks listed in the Soldier's Manual. Thus, the Soldier's
Manual, in effect, says to the soldier, "If you are to be competent in
your MOS, these are the tasks you must be able to perform. In assessing
your occupational competence, we will not go outside the tasks listed here."
The Soldier's Manual must be in the hands of soldiers at least six months
prior to the time they take the SQT.

Sixty days or more prior to the SQT, the soldier receives an SQT
Notice. This document identifies the specific tasks to be tested and
for each task specifies the SQT component. That is, the soldier is told
whether the task is to be tested in the written component (WC), the
hands-on component (HOC), or the performance certification component (PCC). In
selecting tasks for the SQT, the aim is to test those tasks on which
performance deficiencies are most prevalent and/or most serious. The
purpose in issuing an SQT Notice in fairly close proximity to the test
period is to focus individual training efforts upon those tasks most
in need of training.

SQT results are reported to soldiers and to various echelons of
command. The soldier gets an individual soldier's report (ISR) which
identifies the tasks on which he (or she) failed and gives an overall
SQT score. Commanders, from the battalion to the major command level,
receive reports which show in aggregate how soldiers in their command
did on each task tested; i.e., pass/fail percentages for each task.
The aim is to provide to each level of command information useful in
managing, supporting, and facilitating individual training.

The SQT program is being implemented on a schedule which will be
completed in about two years. Skill Qualification Testing for record
began last April with the testing of soldiers in Career Management Field
(CMF) 11 (Maneuver Combat Arms). The testing of soldiers in CMF 16
(Air Defense) and CMF 95 (Military Police) began in July. The testing of
soldiers in CMF 76 (Supply) began this month. With the phase-in of
additional CMF each quarter, SQT will have been implemented for all
enlisted CMF in the Army in the first quarter of FY 1980.

With this brief background statement on the SQT program, we now
turn to the present symposium, which falls somewhat logically into two
sections. In the first two presentations, SQT developed by the
Military Police School and by the Air Defense Artillery School will be
described and discussed. Following these presentations, with a slight shift
in perspective, we will present and examine some early SQT results,
discuss our experiences with performance testing, describe the training
of SQT developers, and reflect on some of our problems and lessons learned.

Use of Video-Tape for the Military Police Skill Qualification Test (SQT)


Daniel E. Spector, PhD, Chief, SQT Development Branch
EPMS Division, Directorate of Training Developments,
USAMPS/TC, Fort McClellan, Alabama 36205


Many military police tasks involve quick decisions in reaction to
what a policeman or woman sees or hears. Conventional paper and pencil
tests have measured ability to perform these tasks by presenting a detailed
written situation and requiring the examinee to choose the appropriate action
for responding to the situation. As the task cues are visual and/or audial,
presentation of a "word-picture" involves a serious drop in test fidelity.
Responding to a written situation is simply not the same as responding to a
dynamic visual problem presented with sound. Moreover, the test developer
can never be sure the word picture conjures up the same image in every examinee's
mind. Finally, a written test cannot require a real time response. The
examinee can ponder the situation or go on to another question in hope of
a hidden cue or flash of brilliance. There is no such luxury in most real
world police tasks.

The US Army Military Police School is overcoming some of these
problems through use of video-tape in the Skill Qualification Test (SQT)
program. If a task involves response to visual or audial cues, these
can be presented to examinees on television through use of video-tapes.
Task fidelity is thereby greatly enhanced. All examinees see and hear
a situation very much as if they were on the job. As they all see the
same thing, the test developer no longer has to worry about a "word-picture"
meaning different things to different examinees. Perhaps the greatest benefit
of this testing mode is the requirement for a real time decision. To
paraphrase Omar Khayyam, the moving picture moves on and only the test
monitor can move it back. The examinee must make a decision quickly;
in but a few seconds another problem will be presented for a decision,
very much as in real life.

The 1977 SQT for military police uses video-tape to test five tasks.
Perhaps the most obvious candidate for a video-tape test is the task of
enforcing traffic regulations. Task cues are entirely visual and dynamic.
Written description of possible traffic violations, even when augmented
by pictures or illustrations, cannot capture the task very well. The video-tape
can do much better. The examinee is told to imagine himself, or herself,
behind the wheel of a patrol car. The camera is to be the eyes of the
examinee. The test then presents ten traffic scenes. For each, the examinee
must decide whether a violation has occurred, and, if so, just what it is.
We think this test comes very close to real world performance. It certainly
has greater task fidelity than a paper and pencil test, while avoiding the
obvious problems of administering a fully hands-on test.

Another task in our video-tape test is use of observation and description
techniques. The examinee is given descriptions of several people and vehicles.
Again, the camera serves as the eyes of the examinee. The test then shows people
milling around a building, going in, or coming out. The examinee has to select
those that match the descriptions. For vehicle identification the examinee
is "driven" through a parking lot and forced to pick out those cars that match
the descriptions. Although this task could be tested in a paper and pencil
mode if adequate pictures were given, there would be some loss of task fidelity.
The video-tape presents a dynamic situation and forces a real time decision.

We are also using video-tape to test the tasks of warning suspects of
their rights and receiving and processing offenders. The rights warning
is not as simple as it seems. The military police must know when the warning
is necessary; this is especially difficult in interviewing a witness who
says something that may make the police think they have a suspect. There
are also problems involving civilian as opposed to military suspects and
with determining when legal counsel is necessary. Our video-tape presents
five vignettes. After each, the examinee must determine what, if anything,
was done wrong. The same thing is done with the receive and process
offenders task. In these tests the video-tape presents the audial cues of
the task as well as the visual ones. A paper and pencil test could not do
this.


The other task in our video-tape test involves recording data in the MP
notebook. This is extremely critical as the notebook is the basis of subsequent
reports, and adequate notes could be critical to the outcome of a court case.
In this task both visual and audial cues are important. The examinee is told
that he and his partner are investigating a crime. The examinee is to record
the results of an interview in the notebook and then walk around the crime scene,
noting possible clues and evidence. The camera is the eyes of the military
police.

This task presented two technical problems. In real life the military
police can ask a question over if the answer is not clear. The test cannot
allow this, but we do have the military police repeat the information as it
is being noted. We will be looking at this carefully to see if this serves
to overcue the examinee. Another problem was when to ask questions about the
task. We wanted to insure the examinee used the notes, not just his or her
memory. To force this, the note taking part of the test appears at the
very beginning of the video-tape test. The examinee then takes the other
parts of the test. At the end of this, about 45 minutes later, the
examinee must answer questions about the notes. We think that the
intervening tasks will erase or confuse the examinee's memory, thus forcing
reliance on the notes.

(Excerpts of the video tape test will be shown here).

The military police video-tape SQT was an "in-house" Army project.
It involved no contract. No purchase of equipment was required. There was
minimal TDY expense, as the Army has complete video-taping facilities at
Redstone Arsenal, only 100 miles from the MP School at Fort McClellan.
The tests were written by noncommissioned officers, subject matter experts in
the tasks. They were assisted by test development and video-tape production
personnel at the two posts involved. All acting was done by Army personnel.

In short, this sort of test may be more difficult to develop than a
paper and pencil test, but it is certainly not beyond the capabilities of
the average service school. The MP School plans to make heavier use of
video-tape in SQT for 1978. We will use this mode for testing our criminal
investigators and corrections specialists; we will also expand its use
for the military police.

This testing mode may also prove useful for testing resident students.
It can present the student with a more realistic problem than a paper and
pencil test. Furthermore, it's a lot more fun to develop. What more can
one ask?


In this presentation, I will describe some properties of SQT so
as to convey a sense of the rationale of skill qualification testing,
and within the context of this rationale analyze early SQT results.
I will also allude to some of our experiences in implementing the SQT
program.

A basic concept of the SQT approach to occupational proficiency
testing is job-task. The primary aim is to test the ability of soldiers
to perform job-tasks which have been identified as critical to their MOS.
At the risk of shunting aside a plethora of difficult questions, I am
going to assume that there are no serious problems in identifying and
meaningfully defining job-tasks or in determining task criticality.


The basic module of the SQT is the scorable unit (SU). A task
usually is represented on the SQT by one SU, although complex tasks
might be represented by two or three SU. The SU confronts the soldier
with a task, or a series of subtasks, which more or less faithfully
captures the essential features of the job task which it represents.
The SU might be regarded as a sub-test, and the SQT as an aggregate
of these subtests. The SU provides the basis for categorizing soldiers
as "performers" or "non-performers" of the referent task. Soldiers
are scored "GO" or "NO-GO" and receive a score of "1" or "0" on each
scorable unit. The total SQT score is derived from an aggregation
of these SU scores; i.e., the SQT score reflects the proportion of
SU on which the soldier scored "GO", expressed as a percentage. Thus,
the SQT scores run from a minimum of 0 to a maximum of 100.


For the individual soldier, the SQT score yields one of three
possible outcomes: (1) failure, (2) verification (or lower passing
scores) and (3) qualification (or higher passing scores). The
higher-passing score is 80 or higher and is taken to indicate that the soldier
is technically qualified for award of the skill-level next above the
one presently held. The lower-passing score is 60-79, and is taken as
indicative that the soldier is technically competent in his present
skill-level. The failing score is 59 or less and is taken to indicate
that the soldier lacks the required technical competence at his present
skill-level.
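
The scoring rule just described can be put in a few lines of Python. This is a minimal sketch and not the ITED scoring program; the function names `sqt_score` and `sqt_outcome` are assumptions made for illustration, and a fractional score between 79 and 80 is assumed here to fall in the verification band, a case the text does not address.

```python
def sqt_score(unit_results):
    """Percentage of scorable units (SU) scored GO.

    unit_results holds one boolean per SU: True for GO, False for NO-GO.
    """
    if not unit_results:
        raise ValueError("an SQT must contain at least one scorable unit")
    return 100 * sum(unit_results) / len(unit_results)


def sqt_outcome(score):
    """Map an SQT score (0-100) to one of the three outcomes."""
    if score >= 80:
        return "qualification"   # qualified for the next higher skill-level
    if score >= 60:
        return "verification"    # competent at the present skill-level
    return "failure"             # lacks required competence at present level


# Hypothetical soldier: GO on 13 of 20 scorable units.
example = [True] * 13 + [False] * 7
print(sqt_score(example), sqt_outcome(sqt_score(example)))
```

Because each SU counts equally, one NO-GO on a short test moves the score much more than one NO-GO on a long test, so the number of SU on a given SQT affects how close a soldier sits to a cut score.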


These SQT scores are highly significant for the soldier's career.
On the one hand, a qualification score on the SQT is prerequisite
to award of the next higher skill level and to the soldier's eligibility
to compete for promotion. On the other hand, after two successive
SQT failures, the soldier is vulnerable to adverse personnel actions.

Responsibility for implementing the SQT program has been assigned
to the Army Training and Doctrine Command (TRADOC). This is not
fortuitous. The SQT is training-oriented. The quest for
training-relevance dictates much of the logic of the system. In selecting tasks
for testing, the aim is to select those tasks which need to be made the
focus of training. In reporting SQT results, the aim is to provide
information for managing and directing training. But at the same time, SQT
scores provide a highly important input for the personnel management
system.

The import here is that the SQT must serve two masters: training
management and personnel management. This can impose divergent
informational requirements on the SQT. Resolving the competing demands
of the trainers and personnel managers has been one of the problem
areas not easily resolved to everyone's satisfaction. But if this
complicates life for us, it seems a necessary complication. If the SQT is to
provide the leverage for focusing and directing individual training efforts,
then the SQT score must have career-relevance. It must make a
difference in terms of the soldier's movement up the career-ladder.

SHAKEDOWN  TESTING 

Prior to testing for record, which began last April, SQT were
administered on a trial basis to soldiers in four MOS. The purpose
was to shake down the SQT system. The scores were not entered into the
records of the soldiers tested. The results of the shakedown testing
will not be reported in detail here. But, to summarize briefly, of the
soldiers tested, 76 percent failed, 23 percent verified their present
skill level, and less than one percent qualified. There may be many
reasons for these predominantly low scores, but I assume that to some
extent they reflect the fact that the testing was not for record.
Promotion, classification, or retention in the Army were not at stake
for the soldier.

Of the lessons learned from the shakedown testing, let me cite three.

1) Hands-on performance testing. Although data on soldier reactions to
the SQT were not systematically collected, there was quite a lot of
informal feedback which indicated that the soldiers reacted very positively
to the hands-on component. But at the same time, we became newly aware
of just how demanding it can be to develop even a relatively simple
performance test. We spent long hours with the SQT developers in the
test development agencies in specifying performance measures in the
detail and clarity required for uniform and objective scoring. Also,
the formulation of instructions on setting up the test station and
administering and scoring the test was difficult and time consuming. But,
nonetheless, one of our major aims is to expand the HOC.

2) Logistics. Another problem about which the shakedown made us wiser,
or at least more sensitive, had to do with the logistics of supporting
hands-on performance testing. For example, if soldiers are to be tested
on the ability to toss hand-grenades, then it is necessary that "practice"
hand-grenades be available not only for testing, but for training. In our
guidance to the schools, where the SQT are developed, we have become
emphatic on this point, i.e., be sure that provision is made to insure the
availability of supplies and equipment required in the hands-on component.
Be very cautious in your assumptions on availability of equipment.

3) A third lesson had to do with errors in filling out SQT answer sheets.
The SQT has three components. There is a separate mark-sense answer
sheet for each component. Whatever else might be said about this set-up,
it seems to be a sensitive and powerful test of the soldiers' clerical
ability. Almost 100 percent of the SQT answer sheets returned to ITED
for scoring contained at least one clerical error. It is a profound
tribute to the Field Services Staff and the Data Processing folks at
ITED that virtually all these answer sheets were finally scored. We
have succeeded in significantly reducing this error rate. But forms
simplification remains one of our major aims.

TESTING  FOR  RECORD 

Testing for record began last April with the testing of soldiers in
CMF 11 (Maneuver Combat Arms). SQT 2, 3, and 4 were administered to
soldiers in the following MOS: 11B (Infantryman), 11C (Indirect Fire
Infantryman), 11D (Armor Reconnaissance Specialist), 11E (Armor Crewman).

Before presenting and discussing results from this first round of
testing, it is necessary to turn aside briefly to clarify two points:
(1) the scheme for numbering SQT, and (2) the use of tracks in SQT.

First let me comment about the numbering of SQT.

Numbering of SQT. The number of an SQT reflects the skill-level of
soldiers who take it. There are five skill-levels in the Army enlisted
MOS structure and these are articulated with pay grade as shown in slide 1.

(Slide One)

Also shown here is the relationship between skill-level and SQT number.
SQT 2 matches skill-level 1, SQT 3 matches skill-level 2, SQT 4 matches
skill-level 3, and SQT 5 matches skill-levels 4 and 5.

The logic in having the SQT one number higher than the
skill-level is that the soldier takes the SQT as a means of qualifying for
the next higher skill level. For example, skill-level 1 soldiers
take SQT 2 as a means of qualifying for skill-level 2, and skill-level
2 soldiers take SQT 3 as a means of qualifying for skill-level 3. The
logic of this system of numbering SQT is the same as the logic for
calling them skill-qualification tests.
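
The numbering scheme amounts to a one-line rule. The sketch below is illustrative only; the function name `sqt_number` is an assumption, and the rule is taken from the correspondence stated in the text (SQT 2 through SQT 4 for skill-levels 1 through 3, and SQT 5 for skill-levels 4 and 5).

```python
def sqt_number(skill_level):
    """Return the number of the SQT taken by soldiers at a given skill-level."""
    if skill_level not in (1, 2, 3, 4, 5):
        raise ValueError("the enlisted MOS structure has skill-levels 1 to 5")
    # One number higher than the present skill-level, capped at SQT 5.
    return min(skill_level + 1, 5)


for level in range(1, 6):
    print(f"skill-level {level} -> SQT {sqt_number(level)}")
```

The cap at 5 reflects the fact that there is no skill-level 6 to qualify for, so skill-levels 4 and 5 share one SQT.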

SQT Tracks. Now with the SQT numbering scheme hopefully clarified, let
me comment about SQT tracks. It is not unusual that soldiers holding
the same MOS and assigned to duty positions in that MOS actually work
with different types of equipment or perform different types of duties.
For example, soldiers in a given MOS may work either with equipment A
or equipment B, but virtually never with both A and B.

This poses a problem in proficiency testing. Should soldiers who
work only with equipment A be held accountable for proficiency in
equipment B, and vice-versa? Consistent with the aim of testing only
those competencies which are directly relevant to the soldier's job, the
resolution of this question in the SQT program has been a policy decision
to allow MOS proponent agencies to track SQT; i.e., to have the SQT
made up of a core of tasks on which all soldiers in the MOS are tested
plus two or more parallel segments, each made up of tasks pertinent only
to certain equipment systems and/or duty positions. The proponent
agency decides on the number of tracks and specifies the rules for assigning
soldiers to tracks.

Infantry Results. Now, after this digression to talk about SQT numbering
and the policy on tracking SQT, let me return to the main thrust of the
presentation and discuss some results of the first round of skill
qualification testing. The presentation here is limited to MOS 11B
(Infantryman) and MOS 11C (Indirect Fire Infantryman). SQT results
for soldiers in these two MOS are summarized in Slide two.

(Slide  Two) 

The slide shows, by skill level, the percentages of
soldiers who (1) qualified for the next higher skill level (scored
"GO" on 80 percent or more of the tasks tested), (2) verified the
present skill level (scored "GO" on 60 percent to 80 percent of the
tasks), and (3) failed to verify the present skill level (scored "GO"
on less than 60 percent of the tasks tested).
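
The three categories on the slide reduce to two cutoffs on the percentage of tasks scored "GO". A minimal sketch of that rule follows. The function name and return labels are hypothetical, and placing a score of exactly 80 percent in the top category follows the "80 percent or more" wording; how the program handled rounding is not stated here.

```python
def sqt_outcome(tasks_go: int, tasks_tested: int) -> str:
    """Classify an SQT result from the share of tasks scored GO.

    Hypothetical helper: the cutoffs (80 and 60 percent) come from the
    slide description; the labels are illustrative.
    """
    if tasks_tested <= 0:
        raise ValueError("tasks_tested must be positive")
    if not 0 <= tasks_go <= tasks_tested:
        raise ValueError("tasks_go must lie between 0 and tasks_tested")

    # Integer comparison avoids floating-point trouble at the cutoffs.
    if tasks_go * 100 >= 80 * tasks_tested:
        return "qualified for next higher skill level"
    if tasks_go * 100 >= 60 * tasks_tested:
        return "verified present skill level"
    return "failed to verify present skill level"


if __name__ == "__main__":
    # 28 of 35 tasks is exactly 80 percent.
    print(sqt_outcome(28, 35))
```

Working in whole task counts rather than rounded percentages keeps the boundary cases unambiguous.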

As  shown  here,  the  11C  soldiers  scored  lower  than  11B  soldiers, 
with  the  11B  vs  11C  differences  being  biggest  at  skill  level  1  (SQT  2) 
and smallest at skill level 3 (SQT 4). My major purpose now is to
examine  the  meaning  of  these  differences,  limiting  the  analysis  to 
SQT  2  of  MOS  11B  and  11C. 

It is pertinent here to note that the 11C test had two tracks:
Track 1 for soldiers who work with the 81 MM mortar and Track 2 for
soldiers who work with the 107 MM (4.2 inch) mortar. As a group,
soldiers taking Track 1 consistently scored slightly higher than
soldiers taking Track 2. This superior performance of soldiers
working with the 81 MM mortar was reflected in higher pass rates on
tasks in the non-tracked portions of the SQT as well as the tracked
portions. In comparing 11B and 11C SQT performances, the 11C scores
will be from soldiers taking Track 1 (i.e., soldiers working with
the 81 MM mortar). The differences within MOS 11C between soldiers
working with the 81 MM mortar and soldiers working with the 107 MM
mortar are a separate issue.

It is consonant with the logic of skill qualification testing to
say that, in the absence of plausible competing interpretations, low
SQT scores are indicative of the need for training. The implication
here then is that MOS 11C soldiers are not as well trained as MOS 11B
soldiers in the duties of their respective MOS. But there are competing
interpretations which need to be examined; namely, interpretations in
terms of test-practice effects and of difficulty.

Test Practice Effects. The role of test practice as a credible
interpretation is based upon the fact that SQT 2 and 3 of MOS 11B were
administered Army-wide last year as part of the shakedown testing. Thus,
many of the 11B soldiers were taking the SQT for the second time. This
was not true of 11C soldiers.

In connection with the question of test practice effects, it is
of interest to examine SQT shakedown scores in relation to the "for
record" SQT scores. The scores are summarized in Slide Three.

(Slide  Three) 

As shown here, the 11B soldiers did considerably better the second
time around. No soldier earned qualifying scores in the shakedown as
compared with 22 percent of the soldiers qualifying in the record testing.
In the shakedown testing, the failure rate was 82 percent versus a failure
rate of 31 percent in the testing for record.

Although consistent with a "practice-effects" interpretation, it is
not clear that these differences really reflect test practice effects.
Since the "for record" SQT scores had career-relevance for the soldier
which the shakedown scores lacked, it is plausible to attribute the
higher scores in the "for record" testing to stronger motivation to do
well on the test. It may be more relevant here to compare 11B and 11C
scores on those tasks which are common to the two SQT. First, let us
examine the Performance Certification Component (PCC) and the Hands-on
Component (HOC).

On both the 11B and 11C SQT, the PCC included an arms qualification
test (Qualify with M16A1 Rifle) and the Advanced Physical Fitness Test
(APFT).

(Slide  Four) 

The rifle qualification test is summarized in Slide Four. The soldier
was scored as follows:

Failure to Qualify      0 Scorable Units
Marksman                1 Scorable Unit
Sharpshooter            2 Scorable Units
Expert                  3 Scorable Units

(Slide  Five) 


The APFT is summarized in Slide Five. It was scored as follows:

Failure to Qualify      0 Scorable Units
300-399 Points          1 Scorable Unit
400-449 Points          2 Scorable Units
450-500 Points          3 Scorable Units
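
The two scales above credit the soldier with 0 to 3 scorable units. As an illustration only, the mapping can be written as a lookup and a set of cutoffs. The names below are hypothetical, and treating any APFT score under 300 points as a failure to qualify is an assumption drawn from the layout of the scale.

```python
# Rifle qualification rating -> scorable units, as listed for Slide Four.
RIFLE_SCORABLE_UNITS = {
    "failure to qualify": 0,
    "marksman": 1,
    "sharpshooter": 2,
    "expert": 3,
}


def rifle_scorable_units(rating: str) -> int:
    """Look up scorable units for a rifle qualification rating."""
    key = rating.strip().lower()
    if key not in RIFLE_SCORABLE_UNITS:
        raise ValueError(f"unknown rifle rating: {rating!r}")
    return RIFLE_SCORABLE_UNITS[key]


def apft_scorable_units(points: int) -> int:
    """Convert an APFT point total (0-500) to scorable units.

    Assumption: scores below 300 count as failure to qualify.
    """
    if not 0 <= points <= 500:
        raise ValueError("APFT points must be between 0 and 500")
    if points >= 450:
        return 3
    if points >= 400:
        return 2
    if points >= 300:
        return 1
    return 0


if __name__ == "__main__":
    print(rifle_scorable_units("Sharpshooter"), apft_scorable_units(425))
```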


(Slide Six)

Scores  on  the  arms  qualification  test  are  summarized  in  Slide  Six. 

(Slide  Seven) 

Scores on the Advanced Physical Fitness Test are summarized in Slide
Seven. MOS 11B and 11C soldiers performed comparably on these tasks.

(Slide  Eight) 

In addition to the rifle qualification and physical fitness tests,
the 11C PCC included a Gunner's Exam on which soldiers were scored as
unqualified with no scorable units, or as second class, first class,
or expert gunner and credited with 1, 2, or 3 scorable units. The scores
are summarized in Slide Eight. The percentages of soldiers qualifying
at these three levels were lower here, but mainly reflected a higher
percentage not rated (i.e., more soldiers who had not taken the Gunner's
exam).

(Slide  Nine) 

The hands-on component included six tasks, five of which were common
to the two SQT. The HOC results are summarized in Slide Nine. The two
groups performed very comparably on the five common tasks. On the sixth
task, which was unique to each SQT, 69 percent of the 11B soldiers scored
"GO" and 62 percent of the 11C soldiers.

The results of performance on the PCC and HOC do not give evidence
of practice effects. On those PCC and HOC tasks common to the two SQT,
the 11B soldiers did no better than 11C soldiers. But it is of interest
here to point out that the SQT Notice, which soldiers get at least 60
days prior to SQT, gives very detailed and complete information on the
PCC and HOC. Soldiers could get a comparable level of information about
the WC only by actually seeing the test booklet. Thus, it would seem
reasonable to expect that whatever advantage might have accrued to 11B
soldiers for having taken the SQT before would not be as strongly reflected
in performance on the PCC or HOC as on the WC.

(Slide Ten)

Of 35 scorable units in the WC of these SQT, 15 were common to the
two SQT. The 11B and 11C pass rates on these 15 tasks are shown in
Slide Ten.

MOS 11B pass rates were higher on eight and lower on six of these
tasks. Pass rates were equal on one task. The differences were generally
modest, with the most notable exception occurring on Task 071-11A-1501
(Call for/Adjust Indirect Fire, using Grid Coordinate Method of Target
Location and Bracketing Method of Adjustment). Thirty-five percent of
11B soldiers scored "GO" on this task as compared to 53 percent of
11C soldiers.

MOS 11B soldiers did not perform better than 11C soldiers on the common
written SU. Thus, the data here do not support the argument that 11B
SQT scores were higher than 11C because of test practice effects. What
the data do indicate is that 11C soldiers most frequently failed those
tasks unique to the 11C SQT.




Difficulty. The three psychologists at ITED who are most familiar with SQT
11B2 and SQT 11C2 agreed in the judgment that the 11C test confronts
the soldier with questions which tend to be cognitively more difficult
than 11B questions. In order to put these impressions to a more
critical test, the two SQT were systematically examined with respect to
readability and content.

In assessing readability, it is recognized that conventional
readability formulas such as the fog-count probably are not fully
applicable to the SQT. These formulas utilize sentence length and word
length as indicators of readability. In a document which uses technical
terms familiar to the audience, word length may cease to be effective
as a readability index. For example, the use of "camouflage" in an SQT
probably does not have the significance that a word of this length would
have in nontechnical writing. However, it seems reasonable to assume
that readability analysis is no less applicable to one of these SQT than
the other. Thus, differential readability scores derived from the tests
presumably would be indicative of actual differences in readability.
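
Formulas of this family combine average sentence length with the share of long words. The sketch below uses the widely published Gunning fog index, 0.4 x (words per sentence + percent of words with three or more syllables), as a stand-in. The exact fog-count variant and the Fry procedure applied to the SQT are not given here, and the vowel-group syllable counter is a crude heuristic introduced only for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of vowels (at least 1 per word)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fog_index(text: str) -> float:
    """Gunning fog index for a passage (illustrative stand-in formula)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    # "Complex" words are those estimated at three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (words_per_sentence + percent_complex)


if __name__ == "__main__":
    sample = "Check the camouflage. Report to the squad leader."
    print(round(fog_index(sample), 1))
```

Because the index rewards short words, a familiar technical term such as "camouflage" is counted as complex, which is the limitation the paragraph above describes.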

Two different readability formulas were applied. The fog-count estimated
the readability of both SQT to be at about the seventh grade level. A
readability formula proposed by Fry yielded slightly different estimates:
about the sixth grade level for SQT 11B2 and one grade higher for SQT 11C2.
The difference here reflects sentence length. The situational descriptions
and item stems included slightly longer sentences in SQT 11C2 than in
SQT 11B2.

Along the same line, it is of interest that the reading load as
reflected in the total number of words in the situational descriptions
and item stems is greater by about 15 percent in SQT 11C2 than in SQT
11B2. The 11C2 written scorable units on the average are slightly longer
in terms of number of items. Of the 20 scorable units which are unique
to the 11B SQT, one-half have four or fewer items. Of the 20 SU unique to
the 11C SQT, one-half have six or more items.

In summary, then, there are indications that SQT 11C2 may impose
slightly higher demands on the soldier in terms of reading burden.

In relation to the question of test difficulty, it is relevant also
to ask whether the 11B and 11C SQT are different in terms of the kinds
of tasks required of the soldier. The distinction between "written
performance" and "performance based" testing is relevant here. ITED
guidance on SQT development characterizes written performance (WP)
testing as that which requires the soldier to perform a task (or task
segment) essentially as it would be performed on the job; whereas
performance based (PB) testing requires the soldier to answer questions
about task performance. Two ITED psychologists analyzed SQT 11B2
and SQT 11C2 in terms of WP and PB testing, and categorized test items
as WP or PB and scorable units as WP, PB, or mixed. This analysis
revealed no differences between SQT 11B2 and SQT 11C2. Of the 20
written SU unique to the 11B SQT, 10 were identified as WP or mixed.
Of the 20 written SU unique to the 11C SQT, 11 were identified as WP
or mixed. Thus, the two SQT were very comparable in regard to
utilization of WP and PB testing.

In another analysis, we grouped SU into the following categories
based upon the types of behavior required of the soldier:

Word recognition
Picture recognition
Chart reading
Mathematical computation

It was assumed that the SU in the first two categories are generally
easier than SU in the last two categories, and pass rates are consistent
with this assumption, i.e., lower on SU involving chart reading and
mathematical computation than on SU requiring word or picture recognition.
Of the 20 written SU unique to the 11B SQT, only one involved mathematical
computation or chart reading. Of the 20 written SU unique to the 11C
SQT, nine involved mathematical computation and/or chart reading.

This analysis suggests that the 11C SQT may have been a more difficult
test than the 11B SQT. It is pertinent to ask whether this is spurious
or is reflective of "real world" differences between these two MOS. Are
tasks which 11C soldiers are required to perform cognitively more complex
than tasks 11B soldiers are required to perform? This question cannot
be answered from the present data, but it is of interest that the more
difficult tasks on the SQT were those unique to MOS 11C. These tasks
had to do mainly with gunnery operations.

Now let me briefly summarize. SQT scores for two MOS were compared
and found to be quite disparate. As compared to their counterparts in
MOS 11B, soldiers in MOS 11C scored substantially lower. It is consistent
with the logic of skill qualification testing to interpret these results
as indicating differential need for training. Competing interpretations
have been examined. The analysis argues against attributing the differences
to test practice effects. There are indications that the 11C SQT may
have been more difficult, but not necessarily in a spurious way. The
11C soldiers had their highest failure rates on tasks specific to the
MOS. These tasks had to do mainly with mortar operations.

In concluding, a major aim in the SQT program is the identification
of performance deficiencies which need to be made the focus of training.
The present analysis provides an early indication of the realization
of this aim.

PAY GRADE, SKILL LEVEL, AND SQT NUMBER

SLIDE 10

PERCENT GO, 11B and 11C, on written tasks common to the two SQT, including:

13. Engage Targets With an M203 Grenade Launcher and
    Apply Immediate Action to Reduce a Stoppage
14. [...] Caliber .45 Pistol and Ammunition
15. Perform an ESC Inspection on a Wheeled
    Vehicle (071-11A-G005)

PERFORMANCE TESTING IN THE SQT

Dr Frank M. Aversano
Mr David H. Poole

INTRODUCTION

As indicated already, the SQT may consist of three parts: a written
component (WC), a hands-on component (HOC) and a performance certification
component (PCC). The written component is a paper and pencil test.
The hands-on test, as its name suggests, tests the soldier by observing
his/her performance on job equipment or simulators. The performance
certification component is a performance test given on the job. The
PCC is used when testing a task in the HOC is impractical or requires
too much time, equipment or other resources. The focus of this paper
is the hands-on component.

In the SQT program we have proceeded somewhat cautiously in regard
to performance testing, and so far the hands-on component has been
relatively small. But this is changing. Eventually, the SQT in some
MOS may be made up mostly of hands-on testing.

But  the  first  steps  into  Army-wide  performance  testing  have  been 
taken.  This  is  an  initial  and  somewhat  limited  progress  report  on  our 
early experiences. In no way should the paper be considered as the
last  word  on  performance  testing  in  the  SQT  program,  or  even  the  first 
word  in  any  official  sense. 

In organizing our comments here, we draw upon Fitzpatrick and
Morrison's chapter on performance and product evaluation in Educational
Measurement (1971) edited by Thorndike. These writers discuss performance
testing in terms of stimulus aspects and response aspects. The HOC
will be discussed here in terms of test instructions, stimuli presented
to the examinee, stimuli surrounding the examinee, response aspects, and
test administration. We will also present an example of a hands-on
score sheet and touch upon the topics of test reliability and validity.


INSTRUCTIONS 


All hands-on tests have extensive instructions which must be written
for four persons who are involved in the administration of the hands-on
test. These persons are the Test Control Officer (TCO), Test Site
Manager (TSM), scorer and examinee. The Test Control Officer is the
individual responsible for the administration of SQT in the field. He/she
is the point of contact between the field and ITED. It is important that
instructions be written in sufficient detail to allow the TCO to organize
for the administration of the HOC. This places a burden on test developers
to consider all aspects of the testing situation.

The Test Site Manager (TSM) is appointed by the TCO to supervise
the administration of the HOC. The TSM is responsible for the equipment
used in the test, the set-up of the test site, the rotation of
soldiers through the test site and the training of scorers.

Scorers are those individuals who actually administer the HOC and
score soldiers' performance.

The  importance  of  explicit  instructions,  training  and  rehearsal  for 
the  scorers  cannot  be  overemphasized.  In  some  cases  TV  tapes  have  been 
developed  to  show  the  scorers  how  to  score  the  HOC.  Examples  of  both 
correct  and  incorrect  performance  are  demonstrated  with  concurrent 
marking  of  the  score-sheet.  In  addition,  the  scorers  are  required  to 
practice  scoring  using  the  scoresheet.  In  a  simulation  of  the  test, 
scorers  play  the  role  of  examinees  and  intentionally  perform  incorrectly 
while  fellow  scorers  evaluate  their  performance.  All  scorers  must  play 
the  role  of  examinee  and  scorer  before  being  permitted  to  score  on 
the  actual  test. 

The fourth and perhaps most important role in the hands-on test
is that of examinee. The problems that deal with examinee instructions
tend to be of two types. In the first type the instructions are so
vague that the examinee does not know what behavior is required to
demonstrate mastery of the task. In the second type, instructions are
too detailed. Instructions are too detailed when they cue the soldier
to the desired behavior. For example, there is one task that requires
the soldier to fire the LAW (Light Anti-Tank Weapon). One of the
performance steps is to inspect the LAW before firing to ensure that the
weapon is free of cracks. If the scorer tells the examinee to inspect
for cracks, he/she has cued the soldier to correct behavior. Excessively
detailed and vague instructions can be reduced by outside review and
trial runs.


Instructions are also clarified for the examinee by releasing the
scoresheet 60 to 90 days prior to the test in the SQT notice. This
procedure has a number of advantages. In addition to providing the
examinee with information about the task to be tested, it may motivate
the examinee to prepare for the test and directs the soldier's training
to critical tasks. A disadvantage of this procedure is that the soldier
may have a tendency to practice only those steps that are on the
scoresheet. Sometimes the scoresheet may not be inclusive because
of the difficulty encountered in reliably observing some performance
step. In some cases this has been solved by putting the performance
step that is difficult to observe on the scoresheet but not scoring
the performance of the step during the test. In this way, training is
maintained as well as reliable measurement.

Stimuli Presented to the Examinee.


The stimuli presented to the examinee during the test should
approximate those stimuli found on the job. The stimuli on the hands-on
test can come in three forms: media, simulators and actual equipment.
Each one of these forms is appropriate in different situations.

Media, for example, are appropriate when no direct intervention is
required on the part of the examinee and when the job stimuli are
transitory. For example, a Military Policeman might only see a suspect or
stolen vehicle for a few seconds before a decision to act must be made.
Similarly, a soldier may only catch a brief glimpse of a tank or
aircraft before he/she must decide if it is part of some threat force.
However, when intervention is required by the examinee, media usually
are not appropriate.

Simulators also present a useful alternative for the test developer.
When the use of actual equipment is impractical or impossible because
of expense, expendability of equipment, safety reasons, or the
large number of soldiers being tested, simulators provide a reasonable
alternative. One problem encountered with simulators concerns the
degree of realism in the simulator. Many developers believe that
simulators are an inadequate form of testing because they lack
sufficient realism. However, this is often not the case if the task and
job analysis provide the necessary information for setting up realistic
standards and conditions, and for creating a device that tests the task
in question. The major problem with this approach is that it takes a
good deal of time, requires technical and educational resources, and is
costly. However, if the device can be used in testing and training, it
becomes an important addition to the soldier's job training.

Another way in which job stimuli can be presented is through the use
of real equipment. Sometimes, the only way job stimuli can be presented
is through the use of real equipment, as is the case in laying a road bed
or using a ship's boom to move cargo. Obviously, it would be inefficient
to lay miles of road or tie up a large transport simply to test a soldier.
This problem has been partially solved through the use of the performance
certification component (PCC) described earlier.
While similar to the HOC, the PCC presents its own unique problems
and will not be discussed here.

Stimuli  Surrounding  The  Examinee. 

The stimuli surrounding the examinee are the test conditions.
The most important aspect of the test conditions in the HOC is that
they be standardized for all examinees. Standardization helps ensure
that all examinees are treated essentially the same and the performance
situation is defined clearly. Standardization is achieved by issuing
instructions that specify what conditions must be met before the test
is administered. The instructions also supply the limits of the test
conditions. Conditions are especially important when the test is
administered outdoors. At a minimum, the following environmental
conditions should be considered: light, precipitation, visibility,
temperature and noise. If the conditions do not fall within those
specified by the test developers, the soldier receives a "Not Rated".
Therefore, conditions must be sufficiently broad to permit
testing under many conditions and limited enough to prevent one
examinee from having an unfair advantage over another.
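
The rule described above, test only when every specified condition is within its limits and otherwise record "Not Rated", can be sketched as a simple range check. Everything concrete in the example is hypothetical: the condition names, units and limits are invented for illustration and are not taken from any SQT instructions.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Limit:
    """Inclusive lower and upper bound for one test condition."""
    low: float
    high: float


# Hypothetical limits, for illustration only.
EXAMPLE_LIMITS = {
    "temperature_c": Limit(0.0, 35.0),
    "visibility_m": Limit(200.0, float("inf")),
}


def rate_conditions(observed: Mapping[str, float],
                    limits: Mapping[str, Limit]) -> str:
    """Return "TEST" if all conditions are in range, else "NOT RATED"."""
    for name, limit in limits.items():
        if name not in observed:
            raise KeyError(f"no observation recorded for {name!r}")
        if not limit.low <= observed[name] <= limit.high:
            return "NOT RATED"
    return "TEST"


if __name__ == "__main__":
    today = {"temperature_c": 20.0, "visibility_m": 500.0}
    print(rate_conditions(today, EXAMPLE_LIMITS))
```

Wide limits let testing proceed on most days, while narrow ones keep examinees comparable; the tension between the two is the point the paragraph makes.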

Response  Aspects. 

Response aspects can be discussed in terms of process or product
scoring. In process scoring the behavior of the soldier is observed
and scored. In product scoring, the result of the behavior is scored.
We believe that it is better to use product scoring whenever possible.
If this is done the scorer can take sufficient time to review the
product and can get the opinion of other scorers. Product scoring
also provides a record of the performance which can be used to monitor
the scorers.

A problem that is sometimes encountered in scoring deals with
error tolerance. This is a problem when the required perceptual
discrimination is minute, as in the sighting of a weapon or in reading
a measuring device that has minute calibrations. Scorers must be made
aware of this problem and a realistic range set in order to reduce
perceptual errors. Scorer fatigue and boredom are another problem that
must be eliminated if accurate scores are to be obtained. Giving the
scorers breaks from scoring and allowing them to score different parts
of the test helps minimize this problem.

An  Example 


At  this  point,  it  may  be  informative  to  show  a  scoresheet  used  in 
a  hands-on  test.  Figure  1  depicts  the  scoresheet  for  assembling  and 
preparing a tactical FM radio. The scoresheet contains seven critical
performance  measures.  The  task  is  product  scored.  All  performance 
measures  are  scored  after  the  soldier  has  completed  the  task.  The 
scorer  does  not  score  the  performance  measures  by  observing  the  soldier 
during  the  process  of  putting  the  radio  together.  In  this  case,  correct 
process  is  determined  by  correct  product.  Essentially,  the  task  can  be 
scored  GO  or  NO-GO  with  performance  measures  five,  six,  and  seven. 


The other performance measures are there for several reasons. First,
soldiers use the scoresheets in their SQT notices as guides in preparing
and training for SQT. Secondly, feedback is more specific and useful
when these other performance measures are listed. Performance measure
seven, "did not damage battery", is used because during tryouts of this
test, when performance measure seven was not yet included, many soldiers
broke the batteries by incorrect assembly procedures. Thus, this
performance measure was added to cover critical process elements of
the task.

The references to the notes in performance steps 6 and 7 refer to
special scoring instructions at the bottom of the page. These notes
provide more detailed information about scoring steps 6 and 7. The
scoresheet  also  provides  a  place  for  recording  any  additional  information 
needed  to  explain  a  NO  GO.  For  example,  if  a  soldier  failed,  the 
scorer  would  write  down  the  reason  for  the  NO  GO. 
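
The scoring standard printed on the Figure 1 scoresheet (GO only if every performance measure is passed, otherwise NO-GO with the cause recorded) can be sketched as follows. The measure wording is taken from the scoresheet; the function and variable names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Performance measures as worded on the Figure 1 scoresheet.
MEASURES = (
    "Assembled radio to include the antenna and antenna base",
    "Set the correct frequency",
    "Turned radio on so that it can transmit",
    "Turned volume up so that radio can receive",
    "Completed the task in 2 minutes",
    "Control station contacted (by scorer)",
    "Did not damage battery",
)


def score_task(passed: Sequence[bool]) -> Tuple[str, List[str]]:
    """Apply the GO / NO-GO standard to one soldier's scoresheet.

    Returns the overall result and the list of failed measures, which
    stands in for the "additional reason(s) for NO-GO" entry.
    """
    if len(passed) != len(MEASURES):
        raise ValueError(f"expected {len(MEASURES)} PASS/FAIL marks")
    failed = [m for m, ok in zip(MEASURES, passed) if not ok]
    return ("GO" if not failed else "NO-GO", failed)


if __name__ == "__main__":
    marks = [True, True, True, True, True, True, False]
    print(score_task(marks))
```

Returning the failed measures alongside the result mirrors the scoresheet's use as a feedback document as well as a scoring record.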


PREPARE (AN/PRC-77) RADIO SCORESHEET (PREPARE/OPERATE TACTICAL FM RADIOS)
(AN/PRC-77, AN/VRC-64, and AN/GRC-160) (071-11A-0930)

PERFORMANCE MEASURES (Product Scored)                PASS    FAIL

1.  Assembled radio to include the antenna
    and antenna base.                                ____    ____
2.  Set the correct frequency                        ____    ____
3.  Turned radio on so that it can transmit          ____    ____
4.  Turned volume up so that radio can receive       ____    ____
5.  Completed the task in 2 minutes                  ____    ____
6.  Control station contacted (by scorer)            ____    ____
7.  Did not damage battery (see note 2)              ____    ____


STANDARD: The soldier is scored GO if he passes all
the performance measures. If he fails any performance
measure, he is scored NO-GO. If the soldier gets a NO-GO,
record on the scoresheet any additional information
required to show the cause of the NO-GO.


GO  NO-GO 


ADDITIONAL  REASON(S)  FOR  NO-GO: 


Scorer's  Signature 

NOTE 1: At the end of 2 minutes when "STOP" is given, the scorer
conducting the test will go around the semicircle to each radio and conduct
a radio check with the control station monitored by the SA. If the
station can be contacted, performance measure No. 4 is scored PASS. If
the control station cannot be contacted, the scorer must determine if
the radio was properly prepared for operation. If the scorer is able to
place the radio into operation by correcting an error made by the
examinee, performance measure No. 4 is scored FAIL. If the scorer determines
that the radio is unserviceable through no fault of the examinee, the
examinee will be retested.

NOTE 2: After the radio check, the scorer will physically check the
radio battery to determine if the battery has been broken. If the
battery is broken, the soldier will receive a NO-GO.


FIGURE 1

Administrative Considerations


When compared to routinely administered written tests, the hands-on
test presents a considerable challenge to test administration. In
order to reduce problems associated with administrative procedures,
the hands-on test undergoes field tryouts during development. Test
developers use tryouts to check the set-up of the test, the clarity
of the instructions, the acceptability of the test to soldiers and to
experts, and the feasibility of the test.


A common problem encountered is the failure of equipment. In order
to deal with the problem, all instructions say that the examinee has the
option to stop the test at any point if a malfunction in the equipment
is suspected. If there actually is a malfunction, the examinee is not
penalized. However, if there is no malfunction, the examinee is
penalized. While some may consider this requirement harsh, it
nevertheless represents a realistic evaluation of performance, because
deciding if equipment is functional is a part of every task. The scorer
normally would not purposely program a fault into a piece of equipment,
and makes every effort to ensure that the equipment is functional. The only
exceptions to this are in cases where the task requires the examinee to
discover some fault in the equipment. Many tests make use of these
so-called "troubleshooting" tasks. The only information the examinee has
is that there is a flaw in the equipment. He/she, of course, has no idea
what flaw was programmed into the equipment. After the test is written
and published, rehearsal is used so that administrators can become
familiar with the test. The rehearsal is done as close as possible to
the actual test date so that scorers and administrators will have recent
experience with administration of the hands-on test. This also gives
scorers and administrators time to solve unique problems and clarify
instructions.


Another requirement, that of having scorers be of the same or a
related MOS as those being tested, also helps reduce the possibility
of scorer error. When the scorers have familiarity with the task, they
are better able to identify and solve problems.

Another  aspect  of  ITED's  plan  to  reduce  and  solve  testing  problems 
is  the  use  of  a  special  hot  line  telephone  number  that  test  administrators 
can  call  to  ask  questions  or  get  information  about  the  test.  However, 
even  this  is  of  limited  use  when  the  test  is  in  progress  or  it  takes 
time to find the answer to a problem. While much can be done to reduce
the occurrence of problems during a test, there is, of course, no way
that all problems can be foreseen and planned for in advance. For
this reason, it has been a policy of ITED that no soldier should be
penalized  because  of  a  problem  that  is  beyond  his  control. 


If some unforeseen problem appears, such as equipment failure,
equipment unavailability, or weather change, the soldier receives a
score of "not rated". A "not rated" does not help or hurt a soldier
in his/her overall testing program. The "not rated" provides a useful
option in situations that are beyond the control of examinees and
administrators. In the Infantry tests only 1% of the soldiers received
a "not rated", a statistic which may suggest that many of the field
problems have been solved.
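
The scoring rules described above (the examinee's option to stop for a
suspected malfunction, and the "not rated" outcome) can be sketched as a
small decision function. This is only an illustrative sketch: the
function and parameter names are hypothetical, treating a confirmed
malfunction as "not rated" follows the paragraph above, and modeling the
penalty for an unfounded stop as a NO-GO is an assumption, since the
paper says only that the examinee "is penalized".

```python
from enum import Enum


class Result(Enum):
    GO = "GO"
    NO_GO = "NO-GO"
    NOT_RATED = "not rated"


def score_station(passed_all_measures: bool,
                  stopped_for_malfunction: bool = False,
                  malfunction_confirmed: bool = False,
                  outside_problem: bool = False) -> Result:
    """Return the result for one hands-on station (illustrative only)."""
    # Problems beyond the soldier's control (equipment failure, equipment
    # unavailability, weather change) neither help nor hurt: "not rated".
    if outside_problem or (stopped_for_malfunction and malfunction_confirmed):
        return Result.NOT_RATED
    # Assumption: stopping the test when no malfunction exists is modeled
    # here as a NO-GO; the paper only says the examinee "is penalized".
    if stopped_for_malfunction:
        return Result.NO_GO
    # Otherwise the station is scored on performance alone.
    return Result.GO if passed_all_measures else Result.NO_GO
```

Keeping "not rated" as a separate outcome, rather than folding it into GO
or NO-GO, mirrors the policy that no soldier should be penalized for a
problem beyond his or her control.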

Another  administrative  problem  is  the  manpower  required  for  hands-on 
testing.  This  problem  will  probably  never  be  solved  completely  to 
a  commander’s  satisfaction  because  he/she  must  provide  numerous  personnel 
to  administer  and  score  the  test.  However,  if  the  testing  provides 
useful information and really enhances the effectiveness of training, the
manpower requirement becomes a much more tolerable problem.

RELIABILITY AND VALIDITY


Reliability and validity are beyond the scope of the paper, but it
is important to mention how they are assessed. Perhaps the most important
aspect of reliability to consider is inter-rater or scorer reliability.
Inter-rater reliability is assessed empirically and must be
at least 80% before a scoresheet is accepted.
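
One simple way to compute such a figure is sketched below. The paper
does not give the formula, so this sketch assumes that inter-rater
reliability is taken as the percentage of performance measures on which
two scorers record the same GO/NO-GO judgment; the function names and
sample data are hypothetical.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percentage of performance measures scored identically by two scorers."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("both scorers must rate the same, non-empty set of measures")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)


def scoresheet_accepted(scorer_a, scorer_b, threshold=80.0):
    """Apply the 80% acceptance rule stated in the text."""
    return percent_agreement(scorer_a, scorer_b) >= threshold


# Hypothetical ratings of five performance measures by two scorers.
ratings_a = ["GO", "GO", "NO-GO", "GO", "GO"]
ratings_b = ["GO", "GO", "NO-GO", "NO-GO", "GO"]
agreement = percent_agreement(ratings_a, ratings_b)  # 4 of 5 measures match
```

With more than two scorers, agreement could be averaged over all pairs;
that extension is likewise an assumption and not something the paper
specifies.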

The other aspects of reliability, i.e., stability, equivalency,
and homogeneity, have not been considered because they are not germane to
hands-on testing. Moreover, the fact that the hands-on test is a
criterion-referenced test argues for not considering these aspects of
reliability.

A more pertinent consideration is that of test validity. Content
validity is established for all hands-on tests by having experts score
a hands-on test. If 75% of these so-called experts agree that the task
is an important part of their MOS and is a fair way to find out if a
soldier can do the task, the test is deemed to have content validity.
While not an empirical method of assessing validity, the method provides
logical validity and ensures that the task is representative of an MOS.
It also ensures that soldiers who will take the test perceive it as a
relevant and fair form of testing.
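
The 75% rule can be illustrated with a short sketch. It assumes that
each expert gives two yes/no judgments (the task is important to the
MOS; the test is a fair way to check it) and that an expert counts as
agreeing only when both are yes. The paper does not spell out this
detail, and the names below are hypothetical.

```python
def has_content_validity(expert_judgments, threshold=0.75):
    """expert_judgments: list of (is_important, is_fair) booleans, one per expert."""
    if not expert_judgments:
        raise ValueError("at least one expert judgment is required")
    # Assumption: an expert "agrees" only if both judgments are positive.
    agreeing = sum(1 for important, fair in expert_judgments if important and fair)
    return agreeing / len(expert_judgments) >= threshold


# Hypothetical panel of four experts; three agree on both points.
panel = [(True, True), (True, True), (True, True), (True, False)]
panel_result = has_content_validity(panel)
```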

In  conclusion,  it  should  be  noted  that  the  Army  is  committed  to 
hands-on  testing  and  will  work  to  solve  the  problems  associated  with 
performance testing.

OVERVIEW OF SQT DEVELOPMENT WORKSHOP


The Training Developments Institute (TDI), the Individual Training
and Evaluation Directorate (ITED), and the Human Resources Research
Organization (HumRRO) are conducting workshops that present the basic
principles of developing criterion-referenced tests as the principles
apply to developing a Skill Qualification Test (SQT). The workshops
were developed by HumRRO under contract with the Army Research Institute
(ARI).

In  this  paper  I  will  describe  the  need  for  such  a  workshop,  the 
constraints  on  the  workshop,  and  the  characteristics  of  the  workshop. 


NEED 


The SQT is a highly complex system in form, development, and
administration.

SQT is a criterion-referenced test; that is, performance on the
test is measured against a standard determined in advance by an
analysis of job performance requirements. Criterion-referenced testing,
in various forms and under various names, has been around for a long
time. But the SQT is, in form, development, and administration, different
from other forms of criterion-referenced systems. The need
for training, as provided by the workshop, arises from these differences
and the problems associated with them.

In  form,  the  principal  difference  is  in  the  three  components,  or 
types  of  test  mode,  within  an  SQT.  An  SQT  consists  of: 

Hands-On  Component  (HOC) 

Performance  Certification  Component  (PCC) 

Written Component (WC)

The Hands-On Component represents the most common type of
criterion-referenced testing. The examinee performs a task, or part of a task,
under standardized conditions, and is evaluated against a standard of
job performance. Even though hands-on testing is most clearly
criterion-referenced, an SQT may have no hands-on component.

The Performance Certification Component also resembles traditional
criterion-referenced testing. The conditions under which the test is
performed are not always standardized, but the allowable range of conditions
is specified. A soldier's performance is usually evaluated by a
supervisor as he performs the task on the job. The PCC is clearly
criterion-referenced: performance is measured against a standard.
But some SQT will have no PCC.

The third component of the SQT is the Written Component. In
appearance, it resembles any other written test: there are item stems
and alternative responses, and the examinee chooses the right response
or responses. But where most written tests sample a domain of knowledge,
the Written Component is criterion-referenced: the examinee performs
portions of tasks, or indicates how the task is performed. The scorable
units, of 2-10 items each, cover discrete tasks, and the item alternatives
represent the actual alternatives that the soldier encounters on
the job. The examinee then receives a score, GO or NO GO, for each task.
All SQT will have a written component.

Thus the form of an SQT is complex. Problems arise in selecting
tasks to be tested, adapting task analysis data to make them suitable for
construction of criterion-referenced tests, and determining which of
the three components is best suited for testing each task. "How we
used to do it" sometimes interferes with how it must be done.

In addition to form of SQT, problems also arise in development.
Development of the SQT involves much more than the construction of the
test for each task. The primary problem in development arises from the
requirement that an SQT must be validated. Unlike most
criterion-referenced testing systems, an SQT must be tried out for reliability
and validity before it can be administered in the field. For the HOC,
the procedure involves checks of scorer reliability and test feasibility;
for the PCC, scorer reliability, test feasibility, and systematic
monitoring of testing must be ensured; for the WC, predictive and content
validity are of concern. The procedures for validating an SQT are unique.

In the area of administration, the SQT also produces some distinctive
requirements and problems. Everyone in a particular MOS skill
level, worldwide, will take the SQT during the same test period. Besides
being large scale, testing is also decentralized. Test Control Officers
at each installation will conduct SQTs. For the HOC and PCC, this means
that developers must prepare very precise performance measures, test
conditions, and instructions to the Test Control Officers, scorers, and
examinees. For the WC, the large scale, decentralized testing means that
every response must wind up as a mark on a machine-scored answer sheet.

These unique characteristics of an SQT, in form, development, and
administration, have created considerable problems. Even experienced
test developers, even experienced criterion-referenced test developers,
have found that developing an SQT is not an easy job. And at most Test
Development Agencies (TDA), the SQT developers are not test experts, but
subject matter experts. Their expertise is vital in SQT development,
but not sufficient.


Early in the history of SQT development, ITED became aware of
recurring problems on nine major tasks or aspects of the test developer's
job:


Select  Tasks  for  Testing 

Review  Task  Analysis 

Allocate  Tasks  to  Components 

Construct  Hands-On  Component 

Tryout  Hands-On  Component 

Construct  Written  Component 

Validate  Written  Component 

Construct  Performance  Certification  Component 

Prepare  SQT  Notice 

Guidance on the nine tasks was published as the Guidelines for
Development of Skill Qualification Tests. In addition to this document,
however, a need was perceived for a controlled, systematic approach to
training and assisting individual developers in the implementation of
the principles contained in the Guidelines. The SQT Development Workshop
was proposed as a means to provide monitored practice in the skills
involved in the nine tasks.

The  overall  objective  for  the  Workshop  is  to  prepare  people  at  Test 
Development  Agencies  to  perform  these  nine  tasks,  and  to  apply  them  to 
their own SQT development.


CONSTRAINTS 

The workshop had to accommodate three constraints. The first of
these is that it had to be exportable. While responsibility for
development of the course was assumed by the U.S. Army Training and
Doctrine Command (TRADOC), ultimately the implementation of the course
is the responsibility of the individual TDA. TDA traditionally experience
considerable turnover among SQT personnel. TDA must be able to
repeat the course as often as their needs dictate.

The decision was made to make the course a part of the total Faculty
Development Program under the direction of TDI. At the TDA or school
level, the course would be the responsibility of the Staff and Faculty
Development section. The requirement, then, was that the course be
exportable to the extent that it could be taught to staff and faculty
personnel who would then act as course managers at their TDA and conduct
the course as needed to meet their own needs.

The second requirement was that the course be self-paced. While
the TRADOC training philosophy incorporates self-pacing in its instructional
model, this was not the only basis for this requirement. Persons
who are assigned to SQT development have a variety of experience. Some
know a great deal about testing but little about the practical limitations
of SQT. When participants come to the workshop, their actual
experience  with  SQT  ranges  from  absolutely  no  prior  exposure  to  SQT 
to  two  years  working  in  SQT  Development.  Thus,  as  the  need  to  learn 
about  various  aspects  of  SQT  varies,  the  workshop  had  to  allow  individuals 
to  work  at  their  own  pace. 

The third constraint related to the time of the workshop. You
may have heard that training should be limited only by the amount of
time required for students to master the objectives. You may even
have said it. But there is almost always an outside limit. For this
workshop the limit is two weeks. Managers are just not willing to
allow people to be away from their desks for more than ten days to
learn to develop an SQT.

These constraints have been faced by other developers. As part
of the total Faculty Development Program, TDI has successfully
implemented a Criterion-Referenced Instruction (CRI) Workshop, developed
by Mager Associates. This CRI workshop has become the basic foundation
for a family of staff and faculty development programs which will provide
the necessary in-house training capability in each TRADOC training
facility.

According to the CRI model, the overall objective for a training
program is broken down into subordinate objectives, and training is
presented in modules corresponding to these subordinate objectives.
Within some limits, participants choose the sequence in which they
will tackle the modules. At the beginning of each module, the objective
for that module is stated, the criterion test is described, and resource
references for the material are listed. Each participant decides
individually how much he must study and practice to pass the criterion test.
A course manager monitors student progress, evaluates criterion tests,
and serves as a learning resource when required by the student.

This basic framework was followed for development of the SQT Workshop.

CHARACTERISTICS OF THE WORKSHOP

The workshop is designed for worker-level development personnel,
that is, the individual who actually must produce an SQT. Because
of the detail in which the material is presented, it is not intended
for senior or most middle management level personnel.


The workshop objective, to prepare people to apply the principles
in the nine tasks, was broken out into 34 subordinate objectives.
For example, one such subordinate objective (within the task "Construct
Hands-On Component") is: "Using a task analysis for a task allocated to
the HOC, construct performance measures for process scoring, product
scoring, or combination scoring." Thirty-four modules were prepared
for these subordinate objectives. Each module contains explanatory
text, examples, and practical exercises. The examples and practical
exercises are for the most part based on common military tasks.
Sample tasks were chosen to illustrate the principles being discussed,
and are intended to be familiar to most course participants.


For each module, there is also a criterion test. Each criterion
test involves one of two types of material. In some tests, the material
is standard, that is, the participant is given a situation, task, or
other information, and applies the concepts put forth in the module
to satisfy the requirements of the test. In others, the participant
is expected to work with a task and material of his choosing from the
MOS and skill level that he will be working with during SQT development.

The balance between the two types of material was not easy to
achieve. Standard tests are amenable to very specific feedback, and
make the role of the course manager easier. However, requiring the
developer to use his own material helps to overcome the "My MOS is
different" syndrome by showing developers the adaptability of the course
materials. In this way, the participant develops a greater appreciation
of the flexibility and relevance of the principles to his own job.
Approximately one-half of the criterion tests involve developers in
using their own tasks.

In the workshop, the nine major tasks discussed earlier were grouped
into seven phases of skill development. These seven phases are necessary
for complete development. Although emphasis in the workshop is on the
individual modules, not the phases, in the time remaining I will briefly
outline what is involved in each of the seven phases. (See Figure 1.)


The first phase, for all participants, is the analysis and planning
phase. At the beginning of the workshop, participants select an MOS
with which to work, one with which they are familiar. They begin with
ten tasks from one skill level of that MOS. In one module, participants
identify sources of information on each task that are objective indicators
of need for evaluation. In another module, participants group the ten
tasks according to the extent of known performance deficiencies. These
modules lead participants to select for testing those tasks which
promise the greatest payoff in testing. From the ten tasks, the course
manager then selects five tasks with which the participant continues to
work. Then, in the criterion test for the task analysis module,
participants review and, if necessary, revise existing task analysis data
for those five tasks to make them suitable for test construction. The
final module in the analysis and planning phase covers allocating tasks
to components. In the criterion test, participants assign each of
their five tasks to the HOC, the PCC, or the WC. High skill physical
tasks are allocated to the HOC or the PCC; mental tasks and low skill
physical tasks are allocated to the WC.
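
The allocation rule in the last sentence can be written out as a short
sketch. The function name, the "physical"/"mental" labels, and the
boolean skill flag are hypothetical simplifications; the Guidelines
presumably involve more judgment than a two-way lookup, and the choice
between HOC and PCC is left open here because the paper does not say
how it is made.

```python
def allocate_task(task_type: str, high_skill: bool) -> tuple:
    """Return the SQT component(s) a task may be allocated to.

    task_type  -- "physical" or "mental" (hypothetical labels)
    high_skill -- True for a high skill task, False for a low skill task
    """
    if task_type not in ("physical", "mental"):
        raise ValueError("task_type must be 'physical' or 'mental'")
    # High skill physical tasks go to the Hands-On Component or the
    # Performance Certification Component.
    if task_type == "physical" and high_skill:
        return ("HOC", "PCC")
    # Mental tasks and low skill physical tasks go to the Written Component.
    return ("WC",)
```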

After participants finish the analysis and planning modules, they
branch into either the HOC construction phase or the WC construction
phase. During the construction phases, participants work with the
tasks selected earlier. For the HOC construction, there are modules
for some preliminary decisions called for in the Guidelines. Then
they work on modules which require that they construct two complete
hands-on scorable units, to include performance measures, conditions,
examinee instructions, and scorer's instructions.

The WC construction phase also requires participants to write
scorable units for tasks they selected. They practice constructing
two kinds of written test: written performance tests, which require
examinees to perform part or all of a task, and performance-based
tests, which require examinees to answer questions about how a task
is performed.

After participants finish the construction phase for a component,
they move to the validation phase for that component. Here, the
activities and criterion tests are standardized, and address the analysis
of data and revision of scorable units based on validation results.

The HOC validation procedure checks interrater reliability,
acceptability, and feasibility. The modules cover locating faults based
on a tryout with experts, computing scorer agreement, checking feasibility
of a scorable unit, constructing a station-load table, and
revising hands-on scorable units.

The WC validation procedure checks discriminant validity and
acceptability. Three options for validation are available, based primarily
on the number and types of soldiers to which the developer has
access. The validation modules cover collecting self-ratings, locating
faults based on a tryout with experts, validating written scorable
units against hands-on tests, selecting a validation option and analyzing
data on each of the three options, and revising written scorable units.

The revision modules cap each validation phase. ITED's policy
in regard to validation is that the results give a basis for locating
and correcting faults in a test. The modules present troubleshooting
charts for hands-on or written tests. The activities and the criterion
tests present summaries of results that indicate malfunctions on
given items or performance measures. Using the troubleshooting charts,
participants then modify the test to correct the probable causes of
the malfunction. These modules call for additional practice of the
skills acquired during the construction phases.

The sixth phase, dealing with the PCC, focuses not only on the
procedures for constructing the PCC but also on procedures for validating
and monitoring it. Participants again work with one of their
own tasks. They describe how the test will be conducted, how it will
be validated, what kinds of results would indicate units for follow-up
checks, and how the checks will be conducted.

In the final phase, after participants have developed a scorable
unit for each component, they prepare an SQT Notice. This is primarily
a check on their mastery of the format for the Notice. It also provides
a neat wrap-up of the course.

In this way, workshop participants work through a full cycle of
SQT development in about ten days. Workshop materials will be revised
based on our experience with 15 TDA currently being trained. Reaction
from those who have already received the training has been overwhelmingly
favorable. Even individuals who have had no previous contact with
test development or with SQT have expressed confidence in their ability
to fit into the SQT system after taking the workshop. Likewise,
participants who have had prior work with SQT state that the workshop
has improved their skills and capabilities. The revised workshop will
then be added to the Staff and Faculty Development courses at the TDA.

The concentrated practical work in this workshop will thus be part of
the on-going TRADOC support of TDA to accomplish the unique goals of
SQT.

TAXONOMY OF TERMS IN JOB ANALYSIS

Major  J.  L.  Mitchell,  USAF 
USAF  Occupational  Measurement  Center 
Lackland AFB, Texas


The term "Job Analysis" is used to refer to the whole class of
studies of occupational information as well as to the study of a single
job. The same language is used to discuss job data used for a variety
of purposes, from the engineering of a new cockpit design to the breakdown
of tasks into elements which can be used to develop criterion objectives
for training. Such a diversity of usage for the term "Job Analysis"
creates  some  difficulties  in  communication  and  sometimes  results  in 
evaluating  job  analysis  systems  which  are  designed  for  different  purposes 
and  have  radically  different  procedures.  This  is,  of  course,  absurd  and 
a  waste  of  valuable  research  time.  To  preclude  such  misunderstanding  and 
waste,  we  need  to  develop  and  consistently  use  a  more  refined  set  of 
categories  which  will  more  precisely  communicate  the  type  of  job  analysis 
and  its  purpose. 

Obviously,  as  an  initial  step  we  must  begin  by  defining  just  what  is 
meant  by  the  generic  term  "Job  Analysis".  McCormick  and  Tiffin  (1974)  have 
defined  it  in  the  following  way: 

"Job  analysis  can  be  considered  as  embracing  the  collection 
and  analysis  of  any  type  of  job-related  information,  by  any 
method,  for  any  purpose.  Perhaps  it  can  be  defined  more 
generally as the study of human work (McCormick & Tiffin 1974:49)."

As you can readily see, this very general definition is so broad as to
include any possible type of study involving human work activities. This
would encompass the entire spectrum of occupational studies, from the
vaguely worded narrative job description (which has been particularly
notable in its use to describe higher level executive and management
positions) to the most precisely specified and quantified job analysis
systems used for the more repetitive or consistent positions (those which
can be done by checklist).

A number of reviews and critiques have lamented the state of Job
Analysis in this country. Kershner (1955) observed, "As is patently
evident, job analysis has been a sort of handmaiden serving in various
ways a variety of needs and all the while floundering in a morass of
semantic confusion." This is a bleak picture indeed, but is probably an
accurate picture of the state of Job Analysis at the time Kershner was
writing in 1955.

Prien and Ronan, in a 1971 review of Job Analysis, suggested that a
considerable amount of progress had been made, particularly in the
military services and in some of the more quantified job analysis systems
such as that developed by McCormick in the Occupational Research Center at
Purdue University. While they cited six major research areas which
remained to be resolved, Prien and Ronan seem to believe that a credible
amount of progress had been made in the decade and a half since Kershner's
critical comments.

In a more recent review, McCormick examined the area of Job and Task
Analysis for the new Handbook of Industrial and Organizational Psychology.
In evaluating the overall area, McCormick concluded that,

"...  with  some  notable  exceptions,  the  study  of  human  work 
has  generally  been  more  in  the  domain  of  the  arts  than  of 
the  sciences.  Perhaps  to  express  it  differently,  the  study 
of human work (which occupies a major part of man's lifetime)
probably has not generally benefited from the systematic,
scientific  approaches  that  have  been  characteristic  of  other 
domains  of  inquiry,  such  as  the  study  of  physical  phenomena, 
biological  phenomena,  or  of  the  behavior  of  man  himself  (as 
through  psychological  and  sociological  research)."  (McCormick 
1976:654) 

The  few  "bright  spots"  which  McCormick  sees  in  this  whole  area 
include  the  work  done  by  the  US  Training  and  Employment  Service  (UST&ES) 
in  occupational  classification,  the  work  of  task  analysis  done  in  the 
military  services  (and  particularly  the  development  of  CODAP  by  the 
Air  Force  Human  Resources  Lab),  and  the  few  civilian  systems  which  focus 
on the quantification of job information (such as McCormick's own
Position  Analysis  Questionnaire  system). 

At  the  risk  of  being  accused  of  bias,  I  would  assert  that  the 
occupational  research  program  of  the  military  services  is  perhaps  the 
brightest of the "bright spots" which McCormick outlined. Of course,
it is impossible to evaluate these three systems or approaches against
one another since they are designed for different purposes, use entirely
different procedures, and serve different populations. However, I think
it would be accurate to say that the CODAP-based military job analysis
systems have probably had the greatest impact in terms of providing a
data base and systematic analysis of that data base to assist managers
in making decisions concerning how jobs are structured, how they should
be organized and defined, and what training is both necessary and relevant.

As early as 1959, Dr. Carroll L. Shartle, who is perhaps the "grandfather"
of occupational studies in this country, wrote that,

"Occupational information in the armed services has continued
to develop and today the Department of Defense has one of the
largest programs in developing and applying occupational information
in the world. Technological changes make it necessary
that occupational research be continued vigorously (Shartle 1959:8)."

Shartle  also  developed  a  basic  taxonomy  of  terms  used  in  the  area  of 
Job  Analysis.  Recognizing  that  our  normal  terminology  dealing  with  jobs 
is often very loosely used in practice, he set forth certain definitions
of terms in order to establish a more realistic framework for our study
of occupations. His definitions are as follows:


Career.  A  career  covers  a  sequence  of  positions,  jobs,  or 
occupations  that  one  person  engages  in  during  his  working 
life. 

Occupation.  An  occupation  is  a  group  of  similar  jobs  found 
in several establishments.

Job. A job is a group of similar positions in a single plant,
business establishment, educational institution, or other
organization.  There  may  be  one  or  many  persons  employed  in 
the  same  job. 

Position.  A  position  is  a  group  of  tasks  performed  by  one 
person.  There  are  as  many  positions  as  there  are  workers  in 
the  organization. 

Positions  could  be  further  broken  down  into  tasks  (those  specific 
activities which, taken together, make up the position) and elements
(those  very  specific  actions  which  comprise  a  task). 

These  terms,  by  and  large,  provide  a  comprehensive  taxonomy. 

They are fairly consistently used in the military services, with the
exception that we tend to collapse the first two categories into one
when we talk of Military Occupational Specialties (MOS) or, in the
Air Force, of Career Fields. Thus we tend to use the terms "career
field", "specialty", or "occupational area" interchangeably to refer
to what Shartle defines as Occupations and Careers. We should, perhaps,
be more consistent and use his terms in order to differentiate between
groupings of related jobs (an Occupation) and what an individual does
during his time in the military service (a Career).

While this taxonomy of terms appears simple and straightforward, we
have already seen that there is some degree of confusion in practical use.
One of the problems is that we use the term "Job Analysis" to refer
generically to the entire area of occupational information. At the same
time, we also use it to refer specifically to the analysis of a group of
related positions within a given work context or organization. In the
Air Force, we tend to use the term "Job Description" regardless of the
level of groups we are talking about. With CODAP we can generate
quantitative "Job Descriptions" for an individual, for groups of related
positions (where it would be most appropriate), and even across groups
to cover entire occupational areas. We would be more precise if we used
Position Description for describing the tasks involved in a unique
individual's work, Job Description for outlining tasks for groups of
related positions, and Occupation Description for summarizing the tasks
involved across related job groups. This usage would let us grasp much
more quickly the exact level of our descriptions, and would communicate
more precisely the level of our analysis.

A very serious compounding factor, however, is the use to which the
analysis is to be put. There is probably more variance in procedures,
type of analysis, and meaning associated with what we plan to do with the
data than there is with the level of grouping. A task, job, or occupational
analysis for the purpose of engineering design of a new weapons system
cockpit is necessarily considerably more detailed than would be required if
the analysis is to be used only for examining the job classification system.
And yet, at present, we continue to use the terms Job, Task, or Occupational
Analysis as if they are equivalent regardless of the purpose of the research.

We need to look, at least in a very general way, at the various possible
uses of Job Analysis data. McCormick and Tiffin (1974) have published a
table which very concisely summarizes some of the uses of the type
of information which is normally obtained through job and task analysis.

Their summary, with some modifications, is shown as Table 1. As you can see
from this display, the possible uses of job information are quite varied
and implicitly demand quite different kinds of information. The kinds of
job information needed for making personnel selection and placement decisions
must necessarily be considerably different from those needed for equipment
design, although obviously they are to a degree related or interactive. The
kinds of people available for selection do affect how you can design the
equipment to be used on a job, and the reverse is also true.

While we could discuss these various uses in considerable detail, it
is more worthwhile here simply to note them and to understand that each
possible use of job information has its own unique requirements. While they
may be related to one another to a small or large degree, the information
needed is by no means identical. The same is true of the levels discussed
earlier. The types of information needed to adequately describe a specific
position are not necessarily what is needed to properly characterize an
occupation. The former by necessity must be much more specific and detailed
than the latter. Further, what can be done with the data is also relevant.
It is useless to assess the similarity of a position with itself; obviously
it is identical. However, such a contrast in terms of similarity is just
what is needed when we study job groups or when we wish to assess the
degree of relationship between various occupational groups or career
patterns. Thus, both the level and the purpose of the analysis are relevant,
and we need some way to communicate both of these in our taxonomy of terms.

Before proceeding to an obvious solution to these taxonomic problems,
it might be worthwhile to cite at least one instance where this type of
basic definitional difference has resulted in a serious scientific problem.
Recently, a draft paper by a Navy research contractor has been circulating
among the various military organizations involved with occupational analysis.
The main point of this paper was to compare the relative cost effectiveness
of the contractor's method of determining Navy course content requirements
and that of the CODAP-based USAF Occupational Survey Program. This
researcher asserted that he could achieve the same objective of establishing
what needs to be trained by using a job analysis expert and a small group of
training specialists qualified in the given occupation without needing to do
an expensive survey or use a sophisticated computer program.

(Adapted from McCormick & Tiffin 1974: fig. 3.1; page 49)


This report could be challenged on a number of points. For example,
the "improved" criterion of success is simply the opinion of an analyst
and a trainer, which is hardly an objective criterion. Further, the contractor
asserts that no one has been concerned with the question of the validity of
survey data; in this he largely ignores the concerns expressed by Kershner
(1955), by Prien and Ronan (1971), by Ash & Kroeker (1975), and by McCormick
(1976), as well as others.

Quite aside from such criticisms of what the researcher said, a much
more basic issue is involved when his entire approach attempts to compare the
results of a training-specific task analysis with what is possible with an
Air Force occupational analysis, where the objective is much broader in
that career structure, classification, promotion testing, and other issues
as well as training are involved. He biases his results by essentially
saying that by focusing on one particular objective, he can accomplish it
more economically than is possible with a more comprehensive system. In
fact, he has not proven even that, even though he ignores the other potential
uses of job information. The more sophisticated CODAP-based system can
address the question of whether training should be given at all, which is
not feasible in the proposed training expert system. The contractor ignores
the fact that changes made in the classification system also have a direct
impact on training requirements.

The contractor's system also makes the assumption that management has
already made most of the necessary decisions as to what the training
population will be, the degree of proficiency required, and related issues.
He completely misses the point that the USAF occupational survey program is
designed primarily to provide Air Force management with the information to
make data-based decisions about an occupational area. As a spin-off from
gathering such information, data are also available to be used in training
design and decisions. Additionally, within the Air Force system, a procedure
much like that proposed is used when decisions are to be made about specific
training content, except that in the Air Force program we currently
have technical representatives from using commands (those who use our
technical training graduates) assist in this type of decision making
("Scrubdown" or "Technical Training Systems Review" Projects).

Finally, the Navy contractor asserts that decisions made in his
system by an expert and the trainers should be used as the basis for
designing occupational structures; that is, that training should drive the
classification system. This, of course, goes much beyond his limited data
(which are largely anecdotal). The Air Force system is predicated on just
the opposite assumption: it is necessary first to determine how people are
or should be utilized using the most complete data base available (which
includes occupational survey data, the expertise of subject-matter specialists,
the experience of senior managers, and any other relevant data). Once the
utilization pattern has been established, then the occupational structure
and the training will follow. Rather than expecting management to make all
the decisions needed to specify training programs, the Air Force occupational
survey system is designed to assist those managers by providing a data base
reflecting current utilization for use as a starting point in the decision
process.

This brief review of a contractor's research is cited here as perhaps
a rather extreme example of attempting to compare a "job analysis" for a
limited purpose (training) with a "job analysis" designed for a higher level of
generality and for various purposes. If we used more precise definitions,
it would be evident that he attempts to contrast a task analysis for training
with an occupational analysis which is multipurpose (classification, structure,
training, etc.). Stated in these more precise terms, his attempt is both
unrealistic and unscientific. It does, however, serve as an excellent
example of why we need to be more explicit and consistent in our use of
Job Analysis terminology.

One obvious solution for this need for more precise language is to
use a composite title which would include both the level and the purpose of
the analysis. Table 2 displays a matrix of the levels of analysis as well
as some of the uses of job analysis. A few of the cells in this table have
been filled in to illustrate the type of terminology which could be used to
very concisely depict the type of analysis being undertaken. Thus, a task
analysis being done specifically to determine training requirements would
be a "Training Task Analysis" while a task analysis being done for the
purpose of human engineering would be an "Engineering Task Analysis". This
system would be most useful at the more specific levels and in those cases
where only one purpose is to be met.

For the higher levels of analysis (particularly in the occupational
analysis or career analysis categories), it may be best to drop the specific
purpose designation, especially where there is more than one purpose to be
served. Thus, where no qualifier is included in the title, we could assume
a multipurpose study. Where only a single purpose is undertaken, such as the
study of job satisfaction, then a more specific title would still be
appropriate (a Career Satisfaction Analysis, etc.).
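
The naming rule described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name
analysis_title, the LEVELS tuple, and the choice to render the Occupation
level as "Occupational" are hypothetical and do not come from Table 2.

```python
from typing import Optional

# Levels of analysis, following Shartle's taxonomy as used in the text.
# The tuple itself is an illustrative assumption, not a copy of Table 2.
LEVELS = ("Element", "Task", "Position", "Job", "Occupation", "Career")


def analysis_title(level: str, purpose: Optional[str] = None) -> str:
    """Compose a title from the purpose (if any) and the level of analysis.

    With no purpose given, the title carries no qualifier, which the text
    proposes to read as a multipurpose study.
    """
    level = level.strip().title()
    if level not in LEVELS:
        raise ValueError(f"unknown level of analysis: {level!r}")
    # The text speaks of an "occupational analysis", so use the adjective.
    noun = "Occupational" if level == "Occupation" else level
    if purpose is None:
        return f"{noun} Analysis"
    return f"{purpose.strip().title()} {noun} Analysis"


if __name__ == "__main__":
    print(analysis_title("task", "training"))
    print(analysis_title("task", "engineering"))
    print(analysis_title("occupation"))
```

Leaving the purpose out is what marks a study as multipurpose, so the
absence of a qualifier carries information rather than being an omission.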

While a more flexible and refined taxonomy of terms will certainly not
solve many of the complex and difficult problems of the study of human work,
it is a starting point for more precise communication about the kinds of
analysis we do. As such, it represents a better way of doing business and
is therefore proposed for your consideration and use. Such a system has
much to commend its use and should provide a foundation on which we can
build a more comprehensive dialogue among ourselves and with the managers whose
need for data-based decisions we are in the business to serve.

Some years after he left the White House, President Truman,
in reply to a question, stated that he was the oldest living
ex-President and he desired to remain that way as long as
possible. As a charter member of the MTA, my feelings parallel
his. I would like to remain a living charter member as long as
possible.

There is another organization to which this parallel applies,
the Pearl Harbor Survivors Association. It is the only outfit
that I belong to that does not have any input from the bottom,
and the members at the top are gradually disappearing. You
might say we have no upward mobility program or, paradoxically,
we have embraced the up-or-out policy to its logical, ultimate
resting place.

So I am happy to note the younger members of the MTA,
especially those with that eager look about them. At times
that eager look has troubled me: I am never quite certain if
they are eager to avoid the mistakes we made, or eager to
repeat them. Perhaps I may be able to help some of you avoid
some mistakes or only partially re-invent the wheel.


SLIDE 1 ON


I will follow the abstract as submitted: how and when
NOTAP started; conditions existing at the time; assumptions
and conclusions made in the early days, both accurate and
inaccurate as proved by subsequent operations; present status
of the program; and a look at possible future uses of
occupational data.


SLIDE  2  ON 


NOTAP was conceived by the SECNAV Task Force on Retention,
which reported out in 1966. The exact recommendation was as
indicated on this slide.

SLIDE 2 OFF

SLIDE 3 ON

The history of the program is as follows: (Speak to slide)

SLIDE 3 OFF

What were the conditions existing in the late sixties and
early seventies? The Navy's unique manpower characteristics
were as listed on this slide.

SLIDE 4 ON

We had an expanding Navy, then a contracting Navy. In the
early seventies the Washington papers were carrying articles
about moving out 25% of Federal Government personnel based
there.

It was not the time and place for a new program to go
operational in the Washington area but, obviously, we made it.

SLIDE 4 OFF

SLIDE 5 ON

To refresh your memory, this slide is a composite of all
the information we collect. (Speak to slide)

SLIDE 5 OFF

What were some of our early recommendations and conclusions?
Some items:

1. Observation/Interview. Prior to constructing a task
inventory, the team assigned interviews a small but highly
representative sample of personnel in the rating, anywhere
from 40 to 150 incumbents. This costs money, especially
travel funds. In addition to providing training for our team,
it helps pinpoint problem areas and provides people-to-people
contact. Most important, it increases the credibility of our
data. A prudent investment; we thought at one time we would
have had a tougher time justifying this expenditure.

2. Data Collection. Originally, we thought that collecting the
data by mail-outs would be cheaper. The usage rate for mail-outs
is about [illegible]; for on-site administration by our staff,
it is 99% plus. Considering printing, travel, postage and
usage rate, our detailed cost studies have clearly shown that
on-site administration is cheaper and faster.

3. Sample Size. Originally, over 50%. Now between 20 and 30 percent.
In addition to ratings, the Navy has Navy Enlisted Classifications
(called NEC's) which are mostly a secondary level of
identification but sometimes can be a tertiary. Some ratings
have over 40 such codes. We failed to anticipate the present
demand for data by these NEC's. As you know, the larger the
number of sub-groups, the larger the sample sizes. On the
other hand, the smaller the sample, the less interference with
the operating forces.

4. Administrative. By this I mean the amount of time consumed in
making arrangements for the observation and interview and data
collection. We just didn't allow enough time for this in our
early estimates. Ships and squadrons are mobile and the number
of ships has decreased. Thus we visit the same ships more often,
and we are very mindful that the primary mission of the operating
forces is not completing questionnaires.

5. Staff. We anticipated a gradual build-up in our approved
operational allowance, but suddenly we went from famine to
feast. As a result our data collection curve looks like this:

SLIDE 6 ON AND OFF

This is presented in a different perspective in the next two
slides and, of course, will result in some scheduling problems
in the future.

SLIDES 7 & 8 ON AND OFF


USE OF DATA


In some areas the use of the data has exceeded expectations;
in others, it has lagged expectations.

For example, about four years ago we were seriously
considering deleting physical demands from our task inventories.
Now the reaction is: is that all you have? In two ratings
we are currently collecting physical demands data at the
request of PRDC, San Diego to assist in a longitudinal physical
requirements research project.

The initial reaction to job satisfaction/dissatisfaction
was much the same. Now the printouts on job satisfaction are
very much in demand. PRDC, San Diego has taken our data on
four sample ratings and subjected it to extensive factor,
multiple regression and correlational analyses. Based on their
preliminary findings we are making some modifications to the
job satisfaction part of our task inventory.

The use of our data in various research projects has exceeded
expectations. The Office of Naval Research and the various
system commands frequently refer companies with whom they have
contracts to us because of the information in our data bank.

The Training Command is the largest bulk user of our data.
Our relationship has become so institutionalized that, after
completing certain internal checks, we immediately forward a
large box of printouts to them as indicated on this slide.

SLIDE 9 ON

SLIDE 9 OFF


Our management bureau, the Bureau of Naval Personnel, tasks
NOTAP for special studies when changes to the enlisted rating
structure are contemplated.

To summarize, the data is being used, in varying degrees,
in most personnel management functional areas. We, of course,
would like to see it used more in certain areas. An observation:
most perceived problems filter down from the top, not up
from the bottom. Thus, it is difficult to convince a manager
that he may have problems in the future and you have helpful
information. He probably has too many other current problems
and he secretly hopes what you describe may not come to pass,
at least not during his watch. Thus, the biggest advances in
the use of the data have taken place at those times when a real
need became apparent. We are grateful for the foresighted and
strong support given to us by our program manager on the staff
of the Chief, Bureau of Naval Personnel.



FUTURE USES OF DATA


In addition to the expanding use of occupational data
brought about by refinements in collecting and presenting
the data, what does the future look like for new applications
of the data?

SLIDE 10 ON

There are two areas that appear promising. These are:

1. Occupational health and safety.

2. Legal cases concerning selection procedures and job
requirements.

The Occupational Safety and Health Act was passed in 1970.
It concerns health in the work place. A forthcoming executive
order will apparently include uniformed personnel specifically
under this Act.

Personnel of the Occupational and Preventive Medicine
Division of the Bureau of Medicine and Surgery were interested
in what Navy ratings were exposed to certain suspected
cancer-causing agents.

Accordingly, they convened a panel of Industrial Hygienists
and Occupational Health Physicians and isolated those tasks
from NOTAP job inventories (rating related) that were related
to certain suspected causative agents. From the computer
printouts, the percent of members who performed each task was then
extracted. The same procedure was followed for equipment items
following our breakout of use or repair, and use and repair.

Last, and also following our breakout, the data were broken
out by Job Titles, Watch Duties, Collateral Duties and
Physical Demands. Regretfully, our relative time statistics
were of little value for this use of the data.

In the future, we shall probably include several specific
task statements that will be of value to the medical people.

This is a highly sensitive area. Further, job incumbents
are often unaware of any long-range health hazards in their
work area.

The full and eventual impact of this Act, as amended, is a
matter of conjecture. It will have, however, a tremendous
effect on our economy as harmless substitute materials are
discovered and manufactured.

With regard to the second area, legal cases: selection
procedures include application forms, interviews, and referrals
as well as tests. Such criteria must be job related; likewise,
any job requirement must also be job related.

Of course, with regard to the provisions of the Privacy
Act and other legal constraints, relevance is less relevant.
It is not only the Civil Rights Act of 1964 and subsequent
Executive Orders that have resulted in virtually a new dimension;
it is all minorities, women and, currently, the aged. For
example, if a woman or an elderly person takes and fails some
sort of strength or dexterity test, and subsequently challenges
said test, management must prove the test is reasonably related
to success on the job or that it is a reasonable job requirement.

SLIDE 10 OFF

The courts have, in the past, based their findings on a test's
validity on the following:

SLIDE 11 ON

1. Professionally developed

2. Skill related

3. Currency

4. Essential - safety, efficiency, morale?

5. Reasonable - business related purposes?

The relationship of occupational task analysis to most of
these factors is obvious.

SLIDE 11 OFF

Any job requirements must be job related in the same context
that tests are. A word of caution: job task analysis indicates
what workers are doing. If management changes the job content
(what workers should be doing), any resulting changes in job
requirements must be validated also. Management, if the job
analysis so indicates, should emphasize the negative aspects
in comparison with the five criteria listed on the slide (in
other words, show damages from the existing conditions) in
addition to validating the changes.


In conclusion, several broad observations. Many of you
here are in research. In the developmental stage we knew that if
the program went operational we would live with our mistakes.
So a good question for a researcher to ask himself is, "If I
should be responsible for operating what I recommend, would
I make the same recommendation or, at least, investigate
further?"

Secondly, although there is widespread use of the data,
the future is still wide open for increased use of the data,
especially in those areas wherein personnel management has
problems.


s  PROGRAM 


TINUING  PROGRAM  IS  NOT  IN  BEING. 


[Figure: PRODUCTIVITY CURVE. Accumulated % of Ratings entered in the
Data Bank and accumulated % of Job Descriptions entered, plotted from
Jan 1971 through Sep 1977.]


STATUS OF THE NOTAP DATA BANK
as of: 1 Sep 1977

Calendar          Number of Ratings     Percent of Ratings
Year              Entered by Year       Entered by Year

1971                      1                     2%
1972                      2                     3%
1973                      8                    13%
1974                      7                    11%
1975                     11                    18%
1976                     20                    32%
Thru 1 Sep 77            13                    21%

Total                    62                   100%

70 Ratings in the Navy

6 Small Ratings plus Hospital Corpsman and Dental Technician
are scheduled or in process


JOB DESCRIPTION
STATUS OF THE NOTAP DATA BANK
as of: 1 Sep 1977

Calendar          Number of             Percent of
Year              Job Descriptions      Job Descriptions
                  Entered Each Year     Entered Each Year

1971                    727                     1%
1972                  2,196                     2%
1973                 10,970                    12%
1974                 12,808                    14%
1975                 23,108                    25%
1976                 32,341                    36%
Thru 1 Sep 77         9,223                    10%

Total                91,373                   100%
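
The accumulated percentages plotted on the productivity curve can be
recomputed from the yearly job description counts in the table above. The
Python sketch below is illustrative only: the function name
accumulated_percent is hypothetical, and the counts are typed in as read
from the table (they sum to the stated total of 91,373).

```python
from itertools import accumulate

# Job descriptions entered in each period, as read from the table above.
ENTERED = {
    "1971": 727,
    "1972": 2196,
    "1973": 10970,
    "1974": 12808,
    "1975": 23108,
    "1976": 32341,
    "Thru 1 Sep 77": 9223,
}


def accumulated_percent(counts):
    """Return the running total of each period as a percent of the grand total."""
    total = sum(counts.values())
    running = accumulate(counts.values())
    return {period: 100.0 * r / total for period, r in zip(counts, running)}


if __name__ == "__main__":
    for period, pct in accumulated_percent(ENTERED).items():
        print(f"{period:>14}  {pct:5.1f}%")
```

Because each point is a running total divided by the grand total, the last
point is 100 percent by construction, whatever rounding was applied to the
per-year percentages in the table.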
STANDARD PACKAGE OF COMPUTER PRINTOUTS
FOR THE TRAINING COMMUNITY


Task Inventory Booklet & Optical Scan Booklet plus


PROGRAM            DESCRIPTION

NECSPG           - NEC Listing by Paygrade

ACTCOD           - Activity Code Listing

PRTDIC           - Print Dictionary (Background Variables)

TITLES           - Titles Listing (Duty & Task)

DIAGRM & GRPMBR  - Diagram (Time) & Group Membership

PRTVAR           - Print Variable (Background Data)

DUVARS           - Duty Variable (% of time spent by each
                   member)

JOBDEC           - Job Descriptions (Paygrade, Skill Level,
                   Primary Diagram Stages, & Skill
                   Levels within Stages, & Total)

GRPSUM           - Summaries by % of Members Performing
                   Each Task & % of Time Spent by
                   All Members

ASFACT           - Levels of Task Performance

AVALUE           - Average Paygrade Performing Tasks

VARSUM           - Variable Summary (Worker Characteristics,
                   Job Satisfaction, Watch Duties, Collateral
                   Duties, Equipment Items, Etc.)
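
The two statistics named for the GRPSUM printout, percent of members
performing each task and percent of time spent by all members, can be
illustrated with a short Python sketch. This is not the NOTAP or CODAP
code: the function name group_summary, the sample ratings, and the
assumption that each member's relative time ratings are normalized to
sum to 100 percent are all hypothetical.

```python
def group_summary(time_ratings):
    """Summarize tasks for a group of members.

    time_ratings maps each member to {task: relative time rating};
    a task that is absent or rated 0 counts as not performed.
    Assumption: each member's ratings are rescaled to percent of time.
    """
    members = list(time_ratings)
    tasks = sorted({t for ratings in time_ratings.values() for t in ratings})

    # Rescale each member's ratings so they sum to 100 percent of time.
    pct_time = {}
    for member, ratings in time_ratings.items():
        total = sum(ratings.values())
        pct_time[member] = {
            t: (100.0 * ratings.get(t, 0) / total if total else 0.0)
            for t in tasks
        }

    summary = {}
    for t in tasks:
        performing = sum(1 for m in members if pct_time[m][t] > 0)
        summary[t] = {
            "pct_members_performing": 100.0 * performing / len(members),
            "avg_pct_time_all_members":
                sum(pct_time[m][t] for m in members) / len(members),
        }
    return summary


# Hypothetical ratings for four members and three tasks.
SAMPLE = {
    "m1": {"A": 2, "B": 2},
    "m2": {"A": 1, "C": 3},
    "m3": {"B": 4},
    "m4": {"A": 1, "B": 1, "C": 2},
}

if __name__ == "__main__":
    for task, stats in group_summary(SAMPLE).items():
        print(task, stats)
```

Averaging over all members, including those who do not perform a task,
keeps the per-task time percentages summing to 100 for the group.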
FUTURE USES OF TASK ANALYSIS DATA


1. OCCUPATIONAL HEALTH AND SAFETY


2. LEGAL CASES

   A. SELECTION PROCEDURES

   B. JOB (WORKER) REQUIREMENTS


1. PROFESSIONALLY DEVELOPED?

2. SKILL RELATED?

3. CURRENCY?

4. ESSENTIAL - SAFETY - EFFICIENCY - MORALE?

5. REASONABLE - BUSINESS RELATED PURPOSE?


AN ALTERNATE COMPUTER APPROACH TO THE
ANALYSIS OF OCCUPATIONAL TASK-FACTOR DATA


Richard W. Dickinson
Occupational Research Program
Industrial Engineering Department
Texas A&M University


INTRODUCTION

The assumption is being made that the reader is at least vaguely
familiar with the CODAP analysis system. While this paper is not
intended to be a technical exposition on the manner in which the
functions of the CODAP task-factor programs in question were
duplicated, it is felt that some knowledge of CODAP is necessary
to place the information presented here in the proper perspective.

Approximately a year and a half ago, the Occupational Research
Program (ORP) at Texas A&M University was faced with the need to
analyze and manipulate task-related information. Normally, this
need would have been satisfied using the following task-oriented
CODAP system programs:

COMGEN

FACSUM 

FACSPC 

FACSTD 

PREFAC 

TSKCAT 

TSKCOR 

TSKFAC 

Unfortunately, the export version of the CODAP source code*
does not contain these programs. This state of affairs led ORP
to search for other means of handling such data. Eventually,
methods were developed for duplicating the functions of these
eight programs through the use of SAS (Statistical Analysis
System).

Description of SAS

SAS is a proprietary product of SAS Institute, Inc., a private
company devoted to the maintenance and development of SAS. It
is an integrated system of data management and statistical analysis
conceived for use on I.B.M. or related equipment. This PL/I-like
language combines statistical versatility with extensive capability
for data manipulation and report writing.

SAS can read data in almost any kind of format from any kind
of file. For data transformations, SAS offers a complete library
of mathematical and statistical functions. The user can create new
variables, delete observations and variables, accumulate totals,
and execute statements conditionally. File-handling tools available
in SAS include merging, sorting, interleaving, concatenating,
subsetting, updating, and interactive editing of data sets. One
of the most useful aspects of SAS is its ability to produce output
from procedures in the form of SAS data sets. For example, SAS
can put predicted values from a regression analysis in a data set
for further analysis or manipulation.

Description  of  Task-Factor  Programs 

Below is a listing of the task-factor programs of interest along
with a short description of their function. These eight programs
were written to allow greater flexibility in the manipulation,
analysis, and reporting of task-oriented data, which, up until this
time, were inaccessible to existing CODAP programs.

COMGEN - Using user-supplied program statements, this program
performs mathematical operations on factor data files.

FACSUM - Prints the final report in a specified sort format. This
program can also create new files representing
differences between various existing files.

FACSPC - Closely parallels program JOBDEC except that it
applies to factor ratings instead of job descriptions.

FACSTD - Closely parallels program INPSTD except that it
creates a rater history file instead of a worker
history file.

PREFAC - Based on regression weights, this program produces
a deck of predicted score values.

TSKCAT - Produces a card deck with "1's" or "0's" representing
specified tasks.

TSKCOR - Computes intercorrelation matrices and user-specified
regression problems.

TSKFAC - This program is used to create new records on the
job description file to be later referenced by
other task-factor programs in the CODAP system.

Statement of the Problem

That task-oriented information can now be accessed in the same way as
information regarding jobs or persons adds significantly to
the tools the job analyst can bring to bear on occupational problem
areas. Unfortunately, the source code in which the eight task-factor
programs were written has not been converted for use on non-Univac
equipment. Such a situation effectively cuts off users of CODAP with
non-Univac equipment from the convenience and flexibility
these programs offer in the handling of task data. It was
for this reason that ORP turned to SAS.

METHOD

The final output form of the report presented in Table 1 was
produced in a single job run. All necessary data for the report
were input via the card reader, although it would have been possible
to access tape or disc devices. The following is a short, highly
simplified description of the means by which SAS duplicated the
functions of the task-factor programs.

As outlined in Figure 1, the data stream initially consists
of task statements, task-factor data, and job survey data. The
task-factor data consist of mean task values in punch card format
produced by program REXALL. Job survey data are in the form of punched
output from programs AVALUE and JOBDEC. Normally, JOBDEC produces no
punched output other than a "JDC" card with which to reference the Job
Description File (to access data such as percent performing). For
our purposes, a modification of JOBDEC was introduced allowing
punched output of any values created by the program. Background data
on raters could also have been input had there been a desire to
later reference data based on certain specified rater history
characteristics. As the data are input, SAS automatically creates
temporary files to be referenced later in the program.

The files of interest are merged (by task statement), creating the
master data set upon which various statistical routines can be
performed.
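
The merge step described above can be illustrated outside SAS. The following is a minimal Python sketch, not the SAS job ORP ran: the function name `merge_by_task`, the field names, and all data values are hypothetical, and the task number is assumed to be the common key across the three inputs.

```python
def merge_by_task(*sources):
    """Merge several dicts keyed by task number into one record per task."""
    master = {}
    for source in sources:
        for task_id, fields in source.items():
            # Create the master record on first sight, then add the new fields.
            master.setdefault(task_id, {"task_id": task_id}).update(fields)
    return [master[task_id] for task_id in sorted(master)]


# Hypothetical inputs standing in for the three card decks.
statements = {1: {"statement": "Task A"}, 2: {"statement": "Task B"}}
factor_means = {1: {"mean_rating": 5.2}, 2: {"mean_rating": 3.1}}
job_survey = {1: {"pct_performing": 80.0}, 2: {"pct_performing": 25.0}}

master = merge_by_task(statements, factor_means, job_survey)

# A transformation applied to the merged data (the kind of step COMGEN handled).
for row in master:
    row["mean_rating_sq"] = row["mean_rating"] ** 2

for row in master:
    print(row)
```

Keying every input on the task number is what lets later steps (regression, sorting, reporting) treat the merged records as a single data set.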

At this point, SAS has performed in a manner analogous to the
following CODAP system programs: TSKFAC, FACSTD, FACSPC. Specific
regression analyses can be performed on the merged data using SAS's
sophisticated General Linear Models procedure along with the request
that predicted values be output to the master data set (essentially
the role of TSKCOR and PREFAC). Prior to being analyzed, data may be
modified by SAS (squared, cubed, etc.) simply through program
statements at any point in the data stream (COMGEN's function). Last,
the final report may be printed in any format or sort order desired
(FACSUM's function). TSKCAT's function could easily be duplicated
either through program generated "1's" and "0's" for specific tasks

Figure 1. Hypothetical Schematic of SAS Input Stream
(Data May Be Manipulated, Transformed, or
Deleted At Any Time Along The Input Stream.)


A PAIRED-COMPARISON APPROACH
FOR ESTIMATING TASK CRITICALITY


Training resource limitations demand that choices be made about
what to include in training, and what to exclude. Agreement seems
widespread that training programs should minimally include tasks that
are critical to effective job performance and cannot be performed by
new trainees. In military training contexts, this reduces to including
in training those tasks that are critical to effective performance in
combat. Since combat cannot be realistically simulated, a measurement
problem immediately arises; namely, how to measure task criticality.

Prescriptive  training  development  literature  typically  mentions 
task  criticality  as  an  important  consideration  in  determining  training 
content.  The  literature  is,  however,  vague  on  the  question  of  how  to 
measure  criticality,  and  silent  on  the  measurement  issues  associated 
with  criticality  estimation. 

Conventional  training  development  methods  deal  with  the  problem 
of  selecting  tasks  for  inclusion  in  training  in  the  following  way: 

A job analysis is conducted, resulting in a task list or "inventory."
Expert judgment is then used to rate the criticality of each task,
usually on some n-point scale ranging from "irrelevant to the job" to
"highly critical to mission accomplishment." The tasks receiving the
highest ratings are selected for inclusion in training, and those
receiving low criticality ratings are excluded or deemphasized. Since the
content of training frequently is determined on the basis of criticality
ratings, a question arises as to how much confidence can be placed in
the ratings. One index of that confidence is inter-rater reliability:
to the extent that several raters independently produce similar
criticality ratings, confidence in the job-relevance of training content based
on the ratings increases. The test-development axiom is directly
analogous: reliability is necessary for validity. Applied to training content
the axiom becomes "reliability (of criticality ratings) is necessary for
job-relevance (of training content)."

The reliability of criticality ratings that are used for determining
training content seldom is reported (McCluskey, et al., 1975; McKnight
and Hundt, 1972). In the few instances where reliability has been
reported (Ammerman and Pratzner, 1975), rater agreement has been poor,
too low for the ratings to be of practical use. We suspected
that low reliability in these studies was due to two important factors.
First, tasks were rated on an absolute rather than comparative basis which,
among other things, tends to restrict the range of ratings. Second, there
were no definitions of criticality to restrict the dimensions along
which raters' judgments were to be made. Therefore, we decided to try
a rating technique that would:

1. Enable raters to compare tasks to one another
rather than to a numerical scale.

2.  Simplify  the  judgment  process. 

3.  Provide  an  operational  definition  of  criticality. 

The paired-comparison technique, although used to scale a variety of
kinds of stimuli from things to people, has not been used, to our
knowledge, in rating job tasks. In the paired-comparison approach,
raters are presented two stimuli and asked to judge which stimulus is
greater with respect to some characteristic such as size, brightness,
or beauty.

We have tried the paired-comparison technique in two studies
(Boldovici, et al., 1976; Boldovici, et al., 1977). In the first project,
two hundred forty tank gunnery tasks were ranked in terms of criticality,
which was determined by the use of the paired-comparison technique. The
Tank Commanders serving as respondents were presented with many pairs
of target/range combinations. (An example of a pair of target/range
combinations is tank at 2000 to 2500 meters, and light-armored vehicle at
500 to 1000 meters.) The respondents were instructed to assume that they
had encountered each pair of target/range combinations on the battlefield,
and that they could not engage the targets simultaneously. They were then
asked to indicate which one of the two target/range combinations that
comprised each item they would engage first. A criticality score was
computed by counting the number of times each combination was chosen as more
threatening ("would be engaged first") and dividing by the number of
times it could have been chosen (Guilford, 1954). Inter-rater reliability
was in the high nineties. Since the rated items varied only in target
type and range, the judgments about target threat or criticality were
easy to make. The high degree of rater agreement probably also reflected
certain learning experiences that the subjects had in common: Tank
Commanders receive formal training in assessing target threat. The high
inter-rater reliability, therefore, may simply have indicated that all
of the subjects had learned "the same things." The second project
asked whether similarly high inter-rater
reliability could be achieved using the paired-comparison technique with
a less homogeneous sample of armor tasks, where the dimensions for making
the criticality judgments were less obvious than target or range, and
where the respondents had not received formal instruction in making
judgments of the kind required for the ratings.
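
The criticality score used in the first project (times chosen divided by times the item could have been chosen) can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' program: the function name, the item labels, and the four judgments below are hypothetical.

```python
from collections import Counter


def criticality_scores(judgments):
    """Return, per item, times chosen divided by times presented.

    judgments: iterable of (item_a, item_b, chosen) tuples, where
    chosen must be one of the two items in the pair.
    """
    chosen = Counter()
    presented = Counter()
    for item_a, item_b, pick in judgments:
        if pick not in (item_a, item_b):
            raise ValueError("chosen item must be one of the pair")
        presented[item_a] += 1
        presented[item_b] += 1
        chosen[pick] += 1
    return {item: chosen[item] / presented[item] for item in presented}


# Hypothetical judgments pooled over raters; labels are placeholders.
judgments = [
    ("tank_far", "lav_near", "lav_near"),
    ("tank_far", "truck_mid", "tank_far"),
    ("lav_near", "truck_mid", "lav_near"),
    ("tank_far", "lav_near", "tank_far"),
]

scores = criticality_scores(judgments)
for item, score in sorted(scores.items()):
    print(item, round(score, 3))
```

Dividing by the number of presentations rather than by a fixed total keeps the score comparable across items that appear in different numbers of pairs.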

Forty-eight captains, who were enrolled in the Armor Officers'
Advanced Course (AOAC) at Fort Knox during the conduct of the project,
served as respondents. Twelve forms of a paired-comparison
questionnaire were used. The stimuli to be rated in each form were the tasks
for one of four crew positions (Driver, Loader, Gunner, Tank Commander)
in one of three tanks (M60A1, M48A5, M60A3). The design of each form
of the questionnaire can be illustrated by describing how the form for
the M60A1 Driver tasks was designed. Seventy M60A1 Driver tasks were
identified during the task-description part of the project. The number
of possible different pairs of 70 tasks, then, is 70 x 69/2 = 2415.
This, of course, would have been too many judgments for each respondent
to make. A partial paired-comparison design (McCormick and Bachus,
1952) was used, in which each of the 70 tasks was paired with each of
seven other tasks. This partial pairing approach yielded 245 unique
pairs of tasks for the M60A1 Driver. The numbers of pairs of tasks for
the other 11 forms of the questionnaire ranged from 135 to 280.
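
The pair counts quoted above can be checked with a short Python sketch. The complete count is n(n-1)/2. For the partial design, the text does not say how tasks were matched, so the circulant layout below is only an assumed stand-in that happens to give every task exactly seven partners; it is not claimed to be the McCormick and Bachus procedure, and the function names are hypothetical.

```python
def complete_pair_count(n_items):
    """Number of distinct pairs among n_items stimuli."""
    return n_items * (n_items - 1) // 2


def circulant_partial_pairs(n_items, per_item):
    """Pair each item with per_item others using a circulant layout.

    Items sit on a circle; each is paired with its neighbours at
    offsets 1..per_item//2 on both sides, plus the opposite item
    when per_item is odd. This layout is an illustrative assumption.
    """
    if not 0 < per_item < n_items or (n_items * per_item) % 2:
        raise ValueError("need 0 < per_item < n_items and n_items * per_item even")
    pairs = set()
    for offset in range(1, per_item // 2 + 1):
        for i in range(n_items):
            j = (i + offset) % n_items
            pairs.add((min(i, j), max(i, j)))
    if per_item % 2:
        half = n_items // 2
        for i in range(half):
            pairs.add((i, i + half))
    return sorted(pairs)


pairs = circulant_partial_pairs(70, 7)
appearances = [sum(i in pair for pair in pairs) for i in range(70)]

print(complete_pair_count(70))  # 2415 complete pairs
print(len(pairs))               # 245 pairs in the partial design
print(set(appearances))         # every task appears in 7 pairs
```

Whatever matching rule is used, a design in which each of 70 tasks has seven partners must contain 70 x 7/2 = 245 pairs, which agrees with the figure reported for the M60A1 Driver form.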

The respondents were instructed to assume that they were company
commanders choosing crew members to take on a mission in which fire would
be exchanged with the enemy. They were then asked to indicate which of
two crew members they would choose, based on whether the crew member
could do one or the other of a pair of tasks. An example of a pair of
tasks for the M60A1 Driver is:

1. Start tank engine.

2. Move vehicle into defilade firing position upon
enemy contact.

Criticality values were calculated for each of the twelve sets of
tasks by a standard three-step procedure (Guilford, 1954) which placed
the twelve sets of values on a similar positive scale. Inter-rater
reliability was estimated by correlating scale values for tasks common
to the three tanks. The correlations ranged from .55 to .79, with an
average of .68. All were statistically significant (p < .05).
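
Correlating scale values for tasks common to two tanks amounts to a product-moment correlation over matched task pairs. The Python sketch below assumes a Pearson coefficient (the text does not name the coefficient used) and uses made-up scale values for five hypothetical common tasks.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either sequence is constant.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale values for the same five tasks rated on two tank forms.
scale_m60a1 = [0.8, 1.9, 2.4, 3.1, 3.6]
scale_m48a5 = [1.1, 1.5, 2.9, 2.7, 3.9]

r = pearson_r(scale_m60a1, scale_m48a5)
print(round(r, 2))
```

Because different respondents completed the different forms, agreement between forms on the common tasks serves as an indirect index of agreement between raters.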

The paired-comparison technique holds promise as an approach for
estimating the relative criticality of tasks. However, the inter-rater
reliability estimates and questions about the validity of the results
obtained in the two projects raise separate issues for discussion
regarding how to generate task criticality estimates that are reliable
and valid.


Reliability 

The reliability of the criticality estimates obtained in the second
paired-comparison study, though statistically significant and probably
greater than the reliabilities of criticality ratings in studies using
absolute ratings (Harris, et al., 1975), seems only marginally
acceptable, particularly when compared to the results of the first
paired-comparison project. The earlier project, however, differed
from the later one in several respects which give rise to some
tentative operating assumptions on how to generate criticality estimates
that are highly reliable. The reliability of the criticality ratings
can be expected to increase with:

1. Specificity of the dimensions along which
criticality ratings are to be made. To the
extent that investigators can create a uniform
set among raters as to the dimensions along
which judgments are to be made, rater agreement
should increase. Without clear specification
of the dimensions for making judgments, raters
will "make up" their own dimensions. And if
these dimensions differ from one rater to the
next, rater agreement will suffer.

2. Common learning experiences among raters. The
obvious recommendation, that raters should
practice making judgments of the kind required
by the criticality study, is warranted only
when the condition just discussed (specific
dimensions) is met. Practice might otherwise
simply reinforce idiosyncratic rater behavior
and thus reduce rater agreement.

3. The extent to which complete pairings of the
tasks to be rated is approximated. The desirability
of eliminating the "luck of the draw"
in determining which tasks get paired with one
another must, however, be traded off against
the heavy rater workloads that characterize
complete pairings with large numbers of stimulus
materials.

4. The number of times each stimulus is rated. Every
respondent need not rate every possible pair of
tasks, though this may be desirable. Decreasing
the workload of each subject can be accomplished
in several ways. Partial pairings can be used,
with all subjects rating all pairs. Or complete
pairings can be used with some of the subjects
rating some pairs and not others. Various mixes
of the approaches also may be used, such as partial
pairings with some subjects rating some pairs and not others.
The optimal compromises are, unfortunately, not
known. It would be interesting to examine
the effects on rater agreement of various reductions
(combined and in isolation) in number or
proportion of compared pairs, number or proportion
of raters rating each pair, and number of observations
per stimulus and pair. The generality of
the results of such research would, of course,
never be fully established. Questions would always
remain about the effects of stimulus materials,
instructions to raters, rater experience, and so
forth, on the results obtained. But if confidence
is desired in the results of studies that purport
to measure the criticality of combat tasks, then
additional research on factors affecting rater
reliability seems necessary.


Validity 

Any study which claims to measure task criticality raises questions
associated with the construct, content, and predictive validity of the
results obtained. Construct validity is concerned with the extent to
which one measured what one intended to measure. Instructions to the
respondents should be designed to create a set for judging criticality
and criticality alone. But raters' judgments may be influenced by
extraneous considerations such as how difficult a task is to learn or
perform, or how frequently it is performed on the job. Questions
about construct validity will remain as long as reasonable
counterinterpretations of the results can be advanced (Cronbach, 1976).

Content validity addresses the extent to which items used in
questionnaires represent the universe of items. The issue of how well
the universe of subject matter is sampled can never be fully resolved.
Resolution would require widespread agreement on the adequacy of the
descriptors used to define the universe, and on precise definitions of
what constitutes adequate sampling. On the other hand, if a job domain
is carefully partitioned into tasks, and all tasks are included in the
criticality study, content validity is not a major concern.

Predictive validity is concerned with the extent to which the
criticality scores, or predictions made from them, would correlate with a
direct measure of criticality. Establishing the predictive validity of the
results of a criticality study would require correlating the obtained
criticality scores with a direct measure of criticality. Obtaining
direct measures of task criticality in combat is, of course, out of the
question. Intermediate criteria, combat simulations, for example, might
be used in studies of predictive validity. Of course, achieving
adequate measurement reliability under simulated combat conditions
would be very expensive, though absolutely essential if any important
decisions are to be made based on the simulation results.

Concern with the validity of the ratings, though appropriate, may
be premature. Reliability issues associated with estimating the
criticality of job tasks have only begun to be raised. Given a) that
nothing is known about the validity of criticality estimation, and
b) choices between results of known and unknown reliability, training
developers would seem well advised to use results whose reliability
is known. In this respect, it appears that the paired-comparison
technique holds promise as a method of rating task criticality.

THE HIERARCHICAL CLUSTERING OF VARIABLES

A PRAGMATIC VIEW

CAPT BARRY P. MCFARLAND ASD/ENECH

ABSTRACT

The purpose of this study was to evaluate the use of cluster analysis
techniques to define task clusters as a tool in the Air Force's Occupational
Analysis Program. Two Air Force job analysis surveys were used for the task
clustering, and the invariance of the task clusters was compared between the
two inventories. The clusters identified were generally independent and
homogeneous. The clusters also showed a high degree of reliability
when the two separate inventories were compared. A number of implications
and potential applications for this type of analysis and results are
discussed.

I. INTRODUCTION


The purpose of this study was to evaluate task (variable) clusters
as a technique and tool that might be applicable to a wide range of Air
Force personnel and management problems on which raw data from the Air Force
Occupational Analysis Program already exist. The approach was to evaluate
task clusters in terms of common characteristics of tasks and to replicate
the clustering and evaluate the invariance of the clusters identified against a
second independent set of task clusters.

CLUSTER ANALYSIS:

Cluster analysis, reduced to its simplest definition, is the grouping of
like objects into identifiable groups. As such, cluster analysis has
conceptual roots that are as old as man himself. Since assessment of
similarities and differences among entities is a universal problem, cluster
analysis techniques have been developed and utilized in virtually all areas
of human thought and scientific endeavor. The modern scientific roots of
cluster analysis can be traced to Linnaeus, who first developed a formal
taxonomy or genotyping based on the grouping of similar characteristics.
The mathematical roots in the measurement of similarity are much more recent,
with origins based upon the work of Pearson (1901) and Spearman (1904).
It was not, however, until the advent of the high speed computer that
mathematical cluster analysis became a tool of general applicability to
scientific study.

The primary impact of cluster analysis has been in what Tryon et al.
(1970) call "O-analysis," or the clustering of objects. The objects
clustered may be individual job descriptions, cells, or genotypes. In
biology this type of analysis has come to be called numerical taxonomy
(Sokal & Sneath, 1963; Jardine & Sibson, 1971). By factor analysts
this type of analysis has been called inverse factor analysis or, in the
more simplified version, the "Q-technique" (Burt, 1937; Cattell, 1952;
Stephenson, 1963). As such it has been relegated to a minor role in factor
analysis and has even been completely omitted by some authors in the area
(e.g., Harman, 1967). Because of this omission, many psychologists, while having
a working familiarity with factor analysis, are completely unaware of any
modern cluster analysis techniques. In addition to psychologists' and
biologists' uses of cluster analysis, other disciplines including electrical
and mechanical engineering have found utility in object analysis techniques
for problems in pattern recognition.

In contrast to object clustering, very little attention has been given to
variable clustering. Variable clustering has long been the exclusive domain
of factor analysis, except for the data reported by Tryon & Bailey (1970).
There are a number of reasons why more attention needs to be given to cluster
analysis of variables. First, it is not always the goal of the analysis to
explain the common variance in terms of the minimum number of constructs.
Cluster analysis, particularly the hierarchical clustering techniques,
permits the researcher the flexibility of identifying any number of clusters
based on a combination of statistical homogeneity of the clusters and
empirical meaningfulness. In contrast, factor analysis has as its most
serious weakness the need for rotation of axes. Since the goal of factor
rotations is to identify factors in terms of "apparently" common psychological
constructs, it is interesting to note that the most popular factor rotation
technique, Varimax (Kaiser, 1958), is sadly lacking in its ability to provide
sound psychological constructs that have stability and consistency. Although
Varimax does provide an optimal solution (i.e., maximum common variance with
minimum number of factors), the solutions are frequently of little utility
in providing the researcher with a better understanding of the data. This
is as much a criticism of our current psychological constructs as a criticism
of Varimax and factor analysis in general. Cluster analysis, however, does not
have the aforementioned analysis problems.

AIR FORCE OCCUPATIONAL ANALYSIS PROGRAM:

The Air Force has used cluster analysis techniques as a part of the
occupational analysis program since 1958. The program uses job inventories
for the collection of quantitative data obtained directly from job incumbents
who describe their job within an Air Force specialty area. In completing the
job inventory, each job incumbent supplies identification and background data
on himself and then checks an extensive listing of job tasks that are inclusive
of all tasks performed within the specialty. In addition to identifying each
task performed, the incumbent rates all tasks he performs on a 7-point scale
indicating the relative amount of time spent on each task compared to all other
tasks performed. The ratings range from 1 (very much below average) to 7 (very
much above average) with 4 being a mid-point (about average).

The techniques for developing the job inventory and occupational analysis
procedures are reported in a series of research reports dating back to 1958.
For the best summary of general procedures for the construction of job
inventories, see Morsh and Archer (1967). The past research and continuing
experience with job inventory data indicate that these clustering techniques
produce highly reliable information about existing Air Force jobs.

The technology that currently exists in the analysis of job inventory
responses allows comparison of job types obtained through the cluster analysis
of individuals (objects). The tasks or variables are not routinely clustered
into independent groupings. Since the total number of tasks for
any given specialty may be extremely large, it would be advantageous to
have objective and empirically defined task groupings. Such groups would
provide the occupational analyst with task summary information that would
benefit the analysis of a specialty code. Also, such information could be
used to identify possible redundant items in the inventory. As a data
bank of tasks is developed, the relative independence and redundancy of
items would be available for future inventory development.

Perhaps most important is the potential of identifying task groupings
with the same or similar rubrics across different specialty areas. Such
groups would allow direct performance comparisons across Air Force specialty
codes with a number of common performance measures. The possible
applications of these measures are numerous, ranging from the identification of
common training requirements, to grade and skill level authorizations, to
officer/airmen manning requirements. Each of these is of critical operational,
as well as research, interest to the Air Force.

II.  METHODOLOGY 

The nurse job inventory contained 648 tasks and was originally completed
by a total of 2664 Air Force nurses. Because of some computer programming
limitations, a sample of 927 cases was randomly selected to form the
task clustering sample. All 648 tasks were used in clustering, although only
575 of these were common to the medical service job inventory.

The medical service job inventory contained a total of 600 tasks and
was originally completed by a total of 2716 Air Force medical service
corpsmen. A sample of 927 cases was randomly selected from the total
sample to be included in the task clustering operation.


In the normal occupational analysis procedure, job typing is performed
by clustering individuals (objects) into homogeneous groups. The input
matrix appears as:

[Matrix diagram: rows are individuals (objects), columns are TASKS]

Clustering  is  accomplished  by  combining  individuals  based  on  the 
similarity  of  the  percent  of  time  spent  on  tasks. 

To complete the task clustering, or variable clustering, this matrix
was rotated so that the following input was used:

[Matrix diagram: rows are tasks, columns are individuals (OBJECTS)]

The clustering procedure used was the same as that reported by Christal
and Ward (1967). Thus, matrices of size 648 x 927 and 600 x 927 were used
for the nurse and medical service input data, respectively.

The hierarchical clustering method used combines the two most common
items into a single group and proceeds in an iterative fashion to combine
one group at a time until only one group remains. Thus, for the nursing
tasks, there were 647 iterations before reaching a single group, and 599
steps for the medical service tasks. For the analysis a percent commonality
measure was used. This is a simple transformation of absolute difference. The
formula for the procedure used is given as:

\[
\text{Percent Commonality} =
  \frac{100 \times \sum_{j} \min(X_{1j},\, X_{2j})}{N\ (\text{OBJECTS})}
\]

where X_{1j} and X_{2j} are the values of the two tasks being compared for
object j.

Note: Other measures could easily be used in the clustering routine. If,
for example, Euclidean distance is used, similarities between the clustering
technique and multidimensional scaling become readily apparent. For this
initial effort, the simplest measure was employed, since no parametric
statistics were to be computed from the similarity measures and since previous
research has shown that the clusters derived are empirically sound
(e.g., McFarland, 1974).
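
The grouping procedure described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the routine
used in the study: the function names are hypothetical, the toy data are
invented, percent commonality is read as 100 times the sum over objects of the
smaller of the two tasks' values divided by the number of objects, and the
similarity between groups that already hold several tasks is taken as the
average of the pairwise values, which the text does not specify.

```python
from itertools import combinations


def percent_commonality(x, y):
    """100 * sum(min(x_j, y_j)) / N, taken over the N objects (respondents)."""
    if len(x) != len(y):
        raise ValueError("both task vectors must cover the same objects")
    return 100.0 * sum(min(a, b) for a, b in zip(x, y)) / len(x)


def cluster_tasks(task_rows):
    """Merge the two most similar groups repeatedly until one group remains.

    task_rows is a tasks-by-objects matrix (one row per task).
    Returns a list of (group_a, group_b, similarity) merges, one per step.
    ASSUMPTION: group-to-group similarity is the mean pairwise commonality.
    """
    groups = {i: [i] for i in range(len(task_rows))}
    merges = []
    next_id = len(task_rows)

    def group_similarity(g, h):
        pairs = [(i, j) for i in groups[g] for j in groups[h]]
        total = sum(percent_commonality(task_rows[i], task_rows[j])
                    for i, j in pairs)
        return total / len(pairs)

    while len(groups) > 1:
        g, h = max(combinations(sorted(groups), 2),
                   key=lambda pair: group_similarity(*pair))
        similarity = group_similarity(g, h)
        groups[next_id] = groups.pop(g) + groups.pop(h)
        merges.append((g, h, round(similarity, 2)))
        next_id += 1
    return merges


# Invented toy data: 3 tasks by 4 respondents (1 = performs the task).
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1]]
print(cluster_tasks(toy))
```

With n tasks the loop always performs n - 1 merges, which matches the 647
and 599 steps reported for the 648 nursing and 600 medical service tasks.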

III.  RESULTS  AND  DISCUSSION 

Figures 1 and 2 are a summary of the major clusters identified in the
hierarchical grouping. The titles for each of the clusters were assigned
based on the task statements and previous experience during job analysis
on nurses and medical service personnel. The clusters were defined as
task families. Each task family consists of a set of tasks which are more
homogeneous within that task family than with any grouping outside the task
family. Thus, if an individual performs any one of the tasks within a
task family, there is a higher probability that he will perform another
task in that family than that he will perform a task in another specific task
family. The tasks comprising each task family are independent and mutually
exclusive of all other task families.

Each task in each family was taken and applied to the job analysis
data obtained in the nurse and medical service job analyses. Cumulative
time-spent values were computed for each task family, for all task families
identified (nurse and medical service). Intercorrelation matrices of the
time-spent data were computed for a sample of 3,115 nurses and medical service
corpsmen to determine the commonality between the task families. The
intercorrelation matrices are shown in Tables 1, 2 and 3.

Table 1 shows the intercorrelations among the nurse task families based
on the percent time spent in each task family for nurses and medical service
corpsmen. Note that the highest correlation is .6747, between task families
100 and 351. As shown in Figure 1, these two task families came together at
group stage 90. Other than this one relatively high correlation, the rest
of the matrix clearly demonstrates a high degree of independence for each of
the task families identified. The relatively high correlations found with
task family 100 are somewhat inflated because of the large number of tasks
contained in this group. Table 2 shows similar results for task families
identified from the medical service job inventory. Two high correlations
were identified. These were between Groups 349 and 211, and between Groups
94 and 211. Groups 349 and 211, as shown in Figure 2, came together to form
Group 89. Thus, the high correlation between these two task families was in
fact expected based on the hierarchical clustering of variables. The two
task families 211 and 94 are highly related in that personnel typically
performing one set of tasks also perform the other. It can be argued that
these two task families should have grouped together in the cluster analysis.
The reason they did not group together was that the two groups did not exist
at the same stage in the hierarchical clustering. This is considered a major
disadvantage or limitation of any accretion method of hierarchical clustering.
The advantage that hierarchical clustering provides is that, although the
tasks in the two groups are independent and mutually exclusive, the similarity
between the time-spent values for the two groups was readily identified. Thus,
although hierarchical clustering does not provide an "optimal" solution, the
occasional resulting discrepancies are easily identified. Aside from the two
high correlations already mentioned, the rest of the correlation matrix in
Table 2 shows the relative independence of each job type.
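
The cumulative time-spent values and their intercorrelations, as described
above, can be illustrated with a short Python sketch. It is not the program
used for Tables 1 through 3: the function names, task identifiers, and data
are hypothetical, and an ordinary Pearson correlation is assumed because the
text does not name the coefficient.

```python
from math import sqrt


def family_time_spent(responses, families):
    """Sum each respondent's percent time over the tasks in each family.

    responses: one dict per respondent, mapping task id -> percent time.
    families:  dict mapping family name -> list of task ids (hypothetical).
    Returns a dict mapping family name -> list of cumulative values.
    """
    return {name: [sum(resp.get(task, 0.0) for task in tasks)
                   for resp in responses]
            for name, tasks in families.items()}


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


# Invented data: three respondents, two task families.
responses = [{"t1": 50, "t2": 50},
             {"t1": 20, "t2": 20, "t3": 60},
             {"t3": 100}]
families = {"family_A": ["t1", "t2"], "family_B": ["t3"]}

spent = family_time_spent(responses, families)
r = pearson_r(spent["family_A"], spent["family_B"])
print(spent, round(r, 4))
```

In this toy case the two families account for all of each respondent's time,
so the correlation is at its negative extreme; real task families would
show the mixture of high and near-zero values discussed above.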


Table 3 shows the correlations between the task families identified
in the two separate analyses, nurse and medical service corpsman. Unlike
the previous two matrices, the tasks comprising any one task family are
not mutually exclusive (i.e., common tasks do occur between task families
for the nurse and medical service families). This table merely identifies
the commonality or uniqueness of each task family previously identified.

In Table 3, note that three of the nurse task families and three of the
medical service task families were unique (having less than 25% common
variance, i.e., a correlation below .50 in absolute value).

From the medical service task families, the following were uniquely defined
as belonging to the medical service career field: Group 345 Inventory tasks,
Group 319 Admissions and Group 14 Emergency Room tasks. These task families
are not only statistically unique from the nurse families but have a great
deal of logical and intuitive appeal as well. Nurses do not perform, as a
rule, the relatively menial administrative tasks that would be associated
with admitting patients to the hospital or performing an inventory of supplies
(nor, in the Air Force, do they have the same responsibilities or nearly the
same numbers of personnel working in emergency rooms). (See McFarland, 1975,
for a comparative analysis of jobs performed.)

The task families that were identified as belonging uniquely to the
nursing career were Group 283 Irrigation tasks, Group 93 Orthopedics tasks
and Group 43 Anesthesia tasks. These task families generally represent
specialized tasks which nurses have been trained to perform and medical
service corpsmen have not. For example, the irrigation tasks are generally
performed in an operating room, and the Air Force has an entirely different
career field for operating room technicians that is totally independent
of the medical service career field.

Overall, however, high correlations were found between task families that
were independently identified from the nurse and medical service job
inventories. This is a good reliability check on the technique and clearly
demonstrates that, aside from the six unique task families identified, all
other task families were consistent across these two similar, albeit
separate, Air Force occupations. This stability is a critical element if the
clustering of variables is ever to be applied to the myriad of problems for
which it shows potential.

POTENTIAL  APPLICATION  OF  VARIABLE  CLUSTERING  WITHIN  THE  AIR  FORCE 

The identification of task variable clusters that are homogeneous
within themselves but are independent and mutually exclusive has a great deal
of potential benefit in relating task information to many areas of management
interest. Potential areas in which task family data might be of benefit
include the development of models for the prediction of skill level
requirements for airmen positions and the identification of aptitude
requirements based on task performance requirements. Closely related to these
is the potential for job redesign. Task family data could be useful in
identifying task families that require an unusually high aptitude. Two
different career fields might then be developed, one consisting of a few high
aptitude personnel, the other (probably much larger) of lower aptitude
personnel.

To test the potential of using task family data to predict skill level
requirements for Air Force jobs, a multiple discriminant analysis was computed
using job incumbents' self-reported duty skill level (DAFSC) as a criterion
against six of the task family time-spent vectors as predictors. The use
of DAFSC as a criterion is very limited. It may not reflect the skill
requirements of the current job, because an individual with a supervisory
skill level may in fact be performing a journeyman level job, which would not
be reflected in his DAFSC. The results of the predicted skill level, using
standardized distance from the centroid as the measure of best fit, are
presented in Table 4.
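
The idea of assigning each case to the skill level whose centroid lies closest
in standardized distance can be sketched as follows. This is a simplified
stand-in, not the multiple discriminant analysis reported here: it
standardizes each predictor by its overall standard deviation and uses plain
Euclidean distance, whereas a full discriminant analysis would work in the
discriminant-function space. All names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def fit_centroids(rows, labels):
    """Return ({label: centroid}, per-variable scale) from training cases.

    rows:   list of time-spent vectors, one per job incumbent.
    labels: reported skill level for each row.
    ASSUMPTION: variables are scaled by their overall standard deviation.
    """
    n_vars = len(rows[0])
    # Fall back to 1.0 so a constant predictor does not divide by zero.
    scale = [pstdev([row[k] for row in rows]) or 1.0 for k in range(n_vars)]
    centroids = {}
    for label in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == label]
        centroids[label] = [mean(m[k] for m in members) for k in range(n_vars)]
    return centroids, scale


def predict(row, centroids, scale):
    """Pick the label whose centroid is nearest in standardized distance."""
    def distance(centroid):
        return sqrt(sum(((x - c) / s) ** 2
                        for x, c, s in zip(row, centroid, scale)))
    return min(centroids, key=lambda label: distance(centroids[label]))


# Invented data: percent time in two task families for four incumbents.
rows = [[10, 0], [12, 1], [0, 10], [1, 12]]
labels = ["Journeyman", "Journeyman", "Supervisor", "Supervisor"]

centroids, scale = fit_centroids(rows, labels)
print(predict([11, 0], centroids, scale))
print(predict([0, 11], centroids, scale))
```

Cross-tabulating such predictions against the reported skill levels, as
percentages within each actual level, yields a table of the same form as
Table 4.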


The results, although showing a regression effect toward the mid-point,
show the potential of using task family data for prediction of airmen skill
levels. Had a better criterion been available, it is felt the results
would have been even more striking in showing the potential of using task
family data.

CONCLUSIONS: 

Many questions are still left unanswered. As mentioned in the introduction,
there has been very little psychometric research on cluster analysis. Basic
questions still exist as to the optimal measures of similarity that should
be used for a cluster analysis, and as to which algorithm provides the most
nearly optimal clustering results.

Cluster analysis is a simple and straightforward analysis technique.
It does not require the same rigorous definitions about the data that are
required in factor analysis, while it still provides the user a certain
interaction and freedom in defining the clusters of specific interest.

The results of this study clearly show the utility of using a variable
clustering technique and the potential that exists for its application to
a wide range of research problems.

TABLE 4  PREDICTED SKILL LEVEL VERSUS ACTUAL SKILL LEVEL

                                  ACTUAL

Predicted       Entry     Journeyman     Technician     Supervisor

Entry
Journeyman                   86.60%         22.73%          2.27%
Technician                   12.52%         63.37%         18.18%
Supervisor                     .88%         13.90%         79.55%

Total                       100.00%        100.00%        100.00%
N =                           2141            374             44

Cell Entries = Percentage of Cases


As armed services' occupational data banks (MODB, CODAP, NOTAP)
reach storage levels approaching the monumental, recording of essentially
unique and specialized work/duty experience data must eventually taper
off; inputs must begin to show massive commonality among elements of job
task performance: commonality inherent in the task elements if not
immediately apparent in task statements. Irrespective of the historical
parochialism of work domains and their attendant descriptive language,
close examination of performances underlying tasks draws attention to
similarities in logic, application of routine, and manipulation.

Technicians (electronics and others) diagnose, using symptoms-analysis
and option-exercise techniques and logic not far removed from such exercise
in medicine. Draftsmen draw on paper; carpenters and molders draw on wood;
metalsmiths, machinists, and hull technicians draw on metals; others draw
on fabrics. For at least part of their work, they all draw; they, and
others, also drill, punch holes in, stitch, pin, crimp, nail, screw, rivet,
or otherwise attach fasteners to wood, fabric, metal, plastic, paper,
even teeth and bones. Dental tools transcend their designed special
application, as witness their use in micro-miniature circuit repair and
jewelry- and model-making.

Commonalities discovered in job/task/skill analysis do not necessarily
remain at the same levels, nor are they always obviously similar throughout
the hierarchies in which they are found; but when job-common items are
identified and their stratification determined, job-unique items then
appear in equal clarity.

The importance of commonality and, hence, the finiteness of its
distribution, lies mainly in its potential for eventual use in training,
manpower management and performance testing. Commonality suggests
transferability across and among ratings, NECs, MOSs, and other work
categories; it is a factor to be employed in building modularized training
programs and tests. Particularly in testing, especially on a broad
scale, established and identified commonality in task performance elements
makes clear what specific work behaviors are being tested throughout a
test or series of tests. Further investigation of the impact of discovered
commonalities within and among registered occupational categories may
confirm the theories of some researchers that there is an inherent
symmetry to the distribution of common work items. If so, commonalities
abound at the lower skill levels (job entry or apprentice) and are
most obvious in task statements related to those levels.

The text of this report is not to be construed as official doctrine of
the Department of the Navy, unless so designated by other authorized
documents.

With the acknowledgement of skill levels and the recognition that
commonality alone has far-reaching implications, hierarchies of tasks,
skills, and other associated work data demonstrate the impact of inherent
task or skill complexity as well. The inclusion of the modifier "inherent"
to describe complexity is for the purpose of differentiating "complexity"
from "learning difficulty", two terms which are sometimes argued as
being, if not practically interchangeable, then at least reasonably
close in meaning, especially within the training community. The
distinction in terms becomes important in this regard: whereas the difficulty
experienced in learning to perform a task may very well closely match
the inherent complexity of that task, learning difficulty can be affected
by variances in training methodology, breakthroughs in media application,
or by changes in sequencing. Learning difficulty, then, becomes a
variable factor. At least relatively speaking, the complexity of a task
is fixed, therefore inherent. Operation of a bulldozer, crane, or pipe
organ requires simultaneous coordinated use of both hands and both feet;
at least in a physical sense, hardly anything can change a complexity
factor pinioned by that fact, unless a significant change in the
associated machine or manipulation requirement takes place, thereby most
likely changing the task itself.

Gathering and examining these and numerous other such examples provided
mounting evidence that here were numerous opportunities to establish
reasonably fixed and almost universally understood criteria (commonality,
complexity, and eventually componency and criticality) eventually reducible
to quantification and data processing for front-end job/task/skill analysis.
This realization led researchers of the Navy Career Training Analysis Group
(CTAG)* to a re-examination of the structure of the Navy's world of work, the
language used to describe it, and the established taxonomies of
job/task/training analysis and the systems approach. That a Gestalt approach
was appropriate became evident early. The reason: the Navy's world of
work is a highly complex structure; it does not neatly coincide with the
existing categories descriptive of work effort in the other armed services
or with the components of the systems approach to training (job, duty,
task, task element, etc.). If, for example, "job" aligns with or equals
MOS or AFSC, it will not similarly align with a Navy rating. Of increasing
importance to the mechanics for conduct of front-end analysis within the
Navy, the available components in the Instructional Systems Development
or systems approach hierarchies were matched with those divisions of Navy
work effort that appeared to be most closely parallel in meaning: job,
task, and task element were matched with rating, Naval Enlisted
Classification (NEC), billet, and watch for the most compatible pairings.

At the outset, job and rating appeared to be a poor match, since the
inevitable conclusion drawn from such a pairing would be that there are
only as many enlisted jobs in the Navy as there are ratings (84), a
number clearly inadequate to describe the complexity and variability
of scope of Navy work, except in the broadest terms. Within the
rating structure are more than one thousand NECs, classifications that
tightly define and describe work in specific occupational areas and on
specific items of equipment, further aligned with the Navy's principal
activity communities: surface, sub-surface, and air. Were job to be
matched to rating, the limited compressibility of the available
taxonometric hierarchy would most likely force overloading of either the
duty or task levels in the resultant inventories. A further complication
would be imbalances inherent in realignment of work assignments
among the three major activity communities and NEC distribution and
range among and within the ratings themselves. For purposes of
job/task analysis, it had to be possible to catalog to the level of single
discrete work behaviors, or task elements; therefore, among the
subordinate work definitions of the Navy, NEC appeared to be the closest
match to job, frequently blanketing billet and watch to a satisfactory
extent. The compass of NECs is such that there appeared to be some
instances of redundant coverage in terms of assumed commonalities
among representative items of equipment operated and maintained, some
instances of apparent componency and/or personnel input spanning
several ratings, some instances of wide variety of technological
requirements; but, in the main, NECs appeared supportable and
satisfactorily definable as jobs. Coupled with rating descriptions,
their numbers appeared to reflect adequately the characteristics,
complexity, and variety range of Navy work.

*Davis, D.D. & Ansbro, T.N., Occupational Analysis for Navy Instructional
Systems Development (ISD): A Matrix Approach. Report to MTA Conference, 1976.

Since rating aligns above job in a vertical matching, it may appear
to have been ignored, since job equals NEC, duty equals responsibility
grouping of tasks, and task equals task. Rating as a parent assembly
of jobs/NECs is affected by the results of job analysis, not necessarily
by, or during the course of, the analysis itself; therefore, it is not
ignored. It is merely a category level above those taxonometric items
used in the analysis, and it is recorded as a parent assembly. Further
modification to the taxonometric array was an attempt to maintain a
terminology strictly aligned with specifically identified skills
behavior. Accordingly, in the concentration on employment of performance
characteristics only in conducting the analysis, it proved
necessary to drop "knowledge" behavior out of the categories in use.
Knowledge behavior was then much more specifically dealt with as
identified mental or intellectual skills and mental skill-directed task
performance elements; the amended taxonometric categories therefore
become more definitive and detailed at the task behavior level and
below. "Knowledge", as a familiar catch-all term for sometimes
undefined work behavior, was minimized, the major bonus being that all
the included work information became quantifiable performance data in
the computer. The leaner taxonometric array remained in harmony with
ISD philosophy, essentially driving testing toward performance measurement
and increasingly orienting training to performance objectives, while
the door appeared to be opening on consideration of eventual performance
testing for advancement and job certification in the fleet.

As indicated generally in the introductory paragraphs, the world
of Navy work shows distribution in more than one direction. Complexity,
difficulty, componency, and other interrelationships drive the data
toward vertical hierarchies commensurate with allocation of
responsibilities, tasks, and component skills among job-entry, whole-job
performance, upper (advanced) technician levels, and management,
herein described in the CTAG effort as: Basic Operator/Technician,
Operator/Technician, Advanced Operator/Technician, and Manager.

Craft or tradesman (technician, operator, administrator, etc.)
boundaries established historically (even sometimes maintained so)
and in consequence of weapon, hardware, and platform introduction
and change generally blanket the tighter NEC structure, which responds
not only to hardware and the differing environments of the community
(air/sub-surface/surface) overlays, but also to other evidence of
uniqueness in work. In such a work matrix, commonality weaves a number
of horizontal lines through the patterns. The included technologies
tend almost inevitably to group themselves into inherent associations,
or families, of work effort, straining at the traditionally or empirically
imposed boundaries. Essentially, Navy maintenance is divided into
categories (Organizational, Intermediate, Depot) that accommodate manpower
management, staffing of maintenance organizations, flow/storage/distribution
of spares, and return of equipment to operational status.


The maintenance categories establish clearly where (organizationally
and geographically) maintenance takes place. The nature of maintenance
performed (or kind of work actually done) falls substantially into four
other categories, essentially subsets: preventive, corrective,
preservative, and replenishment, only one of which (replenishment) appears to
be the exclusive property of a single officially established maintenance
category (organizational). In addition, some maintenance personnel
(or ratings) perform throughout the entire spectrum of repair without
actually fabricating any parts or other items used in rebuilding,
reassembly, or repair; others spend much of their working time in
fabrication while nominally employed within the maintenance category. The
nature of equipment, hardware, materials, and manufacture enters here;
for example:


a.  Precision-manufactured and tuned components must be replaced
in toto on a go-no-go basis, a characteristic of some repair work in
electronics.

b.  Equipments like engines are reassembled with precision-
manufactured parts which may require some honing, fitting, or other
alterations during repair.




c.  Equipments like parachutes and other items of air delivery/
safety may require partial or extensive remanufacture of items because
of the nature of their construction and the characteristics (wear,
durability, fatigue, etc.) of their materials.

d.  Some fabrication of spares may be required in response to
emergency situations which occur only on rare occasions and are outside
of the normal and predictable maintenance requirements.

e.  In other applications, wiring a panel or soldering parts
into a circuit, plastering or cementing over damaged masonry, replacing
rotted, rusted, or otherwise damaged planks, piping, railing,
ladders, gratings, etc. may be considered fabrication; they certainly
are considered maintenance.

Maintenance work generally includes operation of support equipment,
tools, and instruments, a reasonably probable fabrication of
parts or other items (gaskets, seals, shims, etc.), directed
modification of the maintained items, some administrative recording,
ordering, or reporting, operational testing or run-in of equipment
maintained, many inspections and decisions concerning condition, quality,
and operational status of items maintained. Therefore, maintenance
work or a maintenance designation or assignment appears to include:
operation, fabrication, and administration. However, each of these
descriptors is a primary work category by itself:

a.  A builder primarily fabricates and constructs; he only
incidentally maintains.

b.  An equipment operator primarily operates designated equipment
(earth-moving/materials-handling); he secondarily maintains it.

c.  A Personnelman or Yeoman primarily performs administrative
functions; he rarely, if ever, maintains anything (although his job
description may refer to "maintaining files", it can be considered
out-of-context usage).


It is obviously impossible to avoid overlapping and general muddying
of the work descriptors. Also, they won't stay put. For instance,
medical personnel operate tools and instruments and conduct a great deal
of administrative records-keeping. They also maintain their instruments.
As indicated in the introduction to this paper, technicians do a great
deal of diagnosis, drawing, deciphering, tool and instrument manipulation,
and just plain thinking and figuring that may be more alike than they are
different.


Ongoing collection of work data continues to demonstrate commonalities
across as well as among crafts, trades, or other established divisions
of labor, not necessarily at the same levels, nor always obviously
similar through the hierarchies in which they are found.

Obviously, in any attempt to catalogue commonality among identified
areas, or families, of work activity, the methodology used also
points out uniqueness, thereby providing two useful categories. In
addition, it is where (at what level of skill) commonalities are
found that forms the major influence on employment of work activity
data. If the general groupings of common tasks and skills prove to
be essentially symmetrical, that is, if fundamental work actions form
the bulk of common items, then personnel input into a particular work
force can be general and broad-based, and the supportive training for
appropriate job entry could likewise be general. The most beneficial
result of the discovery of such symmetry in the data would be the
possibility of grouping work distribution and training factors to
provide common apprenticeships or job-entry qualifications in associated
fields of work.
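
The point that cataloguing commonality also exposes uniqueness can be shown
with a small Python sketch. It is only an illustration: the function name, the
job labels, and the task-element lists are hypothetical, loosely echoing the
drawing and fastening examples given earlier, and an element is treated as
common as soon as it appears in more than one job.

```python
from collections import Counter


def split_common_unique(jobs):
    """Partition task elements into job-common and job-unique sets.

    jobs: dict mapping job name -> list of task elements (hypothetical).
    Returns (common, unique) where common is a set of elements found in
    more than one job and unique maps each job to its remaining elements.
    """
    counts = Counter(element
                     for elements in jobs.values()
                     for element in set(elements))
    common = {element for element, n in counts.items() if n > 1}
    unique = {job: sorted(set(elements) - common)
              for job, elements in jobs.items()}
    return common, unique


# Invented example inventories.
jobs = {
    "draftsman": ["draw", "measure"],
    "carpenter": ["draw", "nail"],
    "machinist": ["draw", "drill"],
}

common, unique = split_common_unique(jobs)
print(sorted(common))
print(unique)
```

Recording the skill level at which each common element occurs would be the
natural next step, since the text stresses that where commonalities are found
matters as much as their existence.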

The field of inquiry, or area from which data input was to be
captured, was arranged in broad categories of work effort assumed to
include significant common characteristics. The principal purpose of
such a classification scheme overlay was to find and use associations
among the data in such quantities that confident size, and some
indications for early payoff from use of commonalities, might be realized.
Such arrangements of Navy work effort would incorporate presently
identified jobs and other classifications into families. The premise
is not new. It has been proposed in one form or other quite often.
However, as introduced here, it is not a recommendation for a permanent
reorganization of the Navy world of work. It was rather intended
as facile and appropriate research system machinery designed to
effect some initial desired results from the CTAG research assignment
and to aid in setting the general parameters for the data base
so that it would maintain its general characteristics and projected
usefulness in a somewhat uniform manner and without major alteration
largely throughout the unavoidably long run of the Navy multi-job/
task/skill (and training) analysis.

Data assembly could then be represented by families of work effort,
such families characterized by clear similarities in job responsibility;
representative equipment, tools and instruments used; similarities in
methodology; obviously associated fundamentals of technology
(electronics, hydraulics, mechanics, etc.); and what technically expert,
experientially-derived assumptions could be made to set temporary or
experimental boundaries for the families.

In general, the Navy world of work breaks down into the following
broadest categories: Administration, Fabrication, Maintenance,
Operation, and (purely) Military functions. There are obvious overlaps, of
course (fabrication and maintenance, for instance); but these
categories do identify general fields of Navy work within which are the familial
structures (families) introduced above. The familial boundaries
the field of inquiry was introduced and given a trial run. It appeared
to reinforce the initial familial structure assumptions. An equal
opportunity  for  eventual  massive  consolidation  may  not  occur  again,  and 
gains  further  downstream  in  the  overall  run  of  the  system  may  well  be 
much  more  modest. 

It  should  be  noted  that  some  of  the  ratings  proposed  experimentally 
for  classification  within  the  electronics  family  (figure  1)  appear  also 
in  the  somewhat  more  loosely  assembled  electro-mechanical  family  (figure 
2).  Such  assembly  adds  further  reinforcement  to  the  premise  on  which 
accumulated  occupational  data  would  be  employed. 

The primary work categories mentioned earlier (Maintenance,
Fabrication, Administration, Operation, Military) are those recognized
at the present stage of research to provide reasonably clear boundaries
and sufficiently broad areas to include and generally segregate major
classes of Navy work effort according to what are determined to be the
primary characteristics of the work. The principal value of this
further venture into pre-analysis classification was its use in
designing individual matrices for the task analysis portion of the entire
analysis system. It is an indication of the inherent validity of such
classifications for the overall effort that tasks found in maintenance
show differing task element sub-structures than those found in such an
area as military watchstanding or administration. Also, operational
and administrative types of tasks add duty sub-categories and change
reference and other requirements on the worksheets. The nature of
reference material changes as well. Military doctrinal materials,
regulations, and instructions tend to be duty rather than task oriented.
Rather than attempt to design a single analysis matrix (and associated
data-entry devices like Job Data Worksheet, etc.) to cover all types of
task data input to analysis and computer coding entry, it was decided
to construct separate matrices for the identified primary work categories.

In summary, pre-analysis classification overlaid on the Navy world
of work serves as a road map for the analysis route through ratings and
NECs, appears to enhance cumulative and orderly acquisition and early
employment of work data, and facilitates ongoing refinement of the tools
of analysis. In no way does such pre-analysis classification freeze the
accumulated data into categories that may later prove restrictive,
unwieldy, or inappropriate. It merely plots the route and sets the
course for the ongoing effort. The data base can be cross-coded to
facilitate updated task commonality printouts, almost concurrent with
task data accumulation. At the present stage of research, this has
most recently been accomplished, although not yet in final form. Should
category overlaps prove obstructive or assumed associations wear thin,
adjustments can be made without harmful effect on the analysis.
Restructure is always possible, especially with the insights forthcoming from
accumulated experience with the run of the system itself.
engendered by the term "electronics" successfully cover associated
methodology, theory of operation, equipment, tools, instruments,
references, materials, and assumed technical background. So too,
do those described by: communications, weapons control, propulsion,
navigation, and detection -- families and family sub-structures of
Navy work. They divide (in terms of functional field) generally into
operational and maintenance activity, and these distinctions are
clear and not obstructive to familial grouping.

Why, when there are already such established Navy work and personnel
classification entities as rating, NEC, billet, watch specialty, and
community, is it necessary or advisable to include yet another work
classification; and an overlay, at that? Because overlaying a
familial-structure grouping gathers together those ratings that can be assumed
to encompass the associated methodologies, theories of operation,
equipment, etc., that can be termed a Navy industrial family. For
example, see figure 1, a listing of ratings assembled into the
assumptively established "electronics" family. Ratings in this family appear
generally to divide into major areas of employment (navigation,
detection, etc.), or family sub-structures, which are further divided
into two modes: operator and maintenance. These familial structures
suffice to set boundaries around twenty-six ratings that function in
one or both modes in the area described by electronics.

Conducting analysis progressively across and among the ratings
and NECs within this family before moving the research to another
group of ratings has the advantage of (1) proving or disproving the
assumed-association basis of establishing the family, and (2) making
maximum use of commonalities discovered for the purpose of designing
common job-entry manpower input and associated training programs.

The data base so constructed tends to be homogeneous, and
organizationally compresses the distance between initiation of analysis and
roughing-out career management and training program recommendations.

Examination  of  Figure  2  will  show  a  further  extension  of  familial 
structuring  --  an  electro-mechanical  grouping.  At  this  point  in 
assembling  the  associated  ratings  here,  no  attempt  was  made  to  balance 
the involvement of the ratings among the modes and sub-structures.

The groupings, as here demonstrated, represent a first cut at assembly --
an admittedly assumptive grouping intended to enhance data gathering and
programmed retrieval with a strong orientation toward examining such
commonalities as surfaced. In an experimental grouping across Navy
ratings, it was determined that the twenty-six ratings identified as
electronics-associated presented most fruitful prospects for early and
substantial payoff in terms of projected or at least viewed possibility
of consolidation of job-entry training and non-NEC-specific training
beyond the first enlistment, with constant manpower management and
enlisted career program development opportunities as well. Further, a
decision-making model to influence determination of the structure of
Symposium: USAF Occupational Measurement Center Programs

Chairman:  Colonel  James  A.  Turner,  Jr. 

Commander 

USAF  Occupational  Measurement  Center 
Lackland  AFB,  Texas  78236 


This symposium outlines the ongoing programs of the USAF Occupational
Measurement Center: construction of occupational tests in support of the
Weighted Airman Promotion System and the USAF Occupational Survey Program.
These major missions will be outlined in detail and recent trends and
developments will be discussed. In addition, the interaction of these
programs will be examined. In some respects, the occupational survey
program can be considered as the initial step in ascertaining Air Force
utilization policy for each specialty area. Occupational Survey Reports
are used in utilization conferences which result in validation or
modification of Specialty Training Standards, Technical Training, Career
Development Courses, On-The-Job Training, and eventually in changes to the Specialty
Knowledge Tests. Thus USAFOMC is involved in both ends of the process and
is in a unique position to make a substantial contribution to the Air Force
personnel subsystem.

Presentations in this symposium include:

Overview: USAF OMC Organization and Missions
Capt C. D. Gorman
Management Applications Section

Trends in Test Development
Capt J. R. Johnson
Test Construction Section

USAF Occupational Survey Program
Mr J. B. Keeth
Airmen Career Areas Analysis Section

Management Applications and Special Projects
Major S. D. Stephenson, Chief, Officer Survey & Management
Applications Section

The Interface Between Occupational Surveys and Test Construction
Capt David Vaughan
ATC Technical Applications Center
THE USAF OCCUPATIONAL MEASUREMENT CENTER: AN ORIENTATION

by

Captain Charles D. Gorman
Occupational Survey Branch
USAF Occupational Measurement Center
Lackland AFB TX


The purpose of this paper is to provide an overview of the activities
of the United States Air Force Occupational Measurement Center, located at
Lackland Air Force Base in San Antonio, Texas. The center's mission, as
depicted by the organizational emblem, is to measure the job and measure
the people performing the job. More specifically, the center is charged
with conducting Air Force occupational surveys and developing personnel
tests in support of the Weighted Airman Promotion System.

The manning posture of the center is unique. There are approximately
equal numbers of military and civilian people, with 57% of the
center's personnel identified as professional. Most of these professionals
hold advanced degrees, primarily in the behavioral sciences.

The center is comprised of four branches. The survey branch performs
all work associated with the operational Air Force Occupational Survey
Program. The support branch provides the many functions of an orderly room and
also maintains one of the largest technical and documentary reference
libraries on the base. The test development branch is responsible for major
test revision and test research. Support of testing programs and minor
test revisions are provided by the test services branch.

The mission of the Occupational Survey Branch is three-fold. First,
surveys of airman specialties are accomplished on a recurring basis,
approximately once each four years. A new system for maintaining up-to-date
task lists for job specialties, called the Current Task Inventory Bank, will
provide the center with the capability to survey more frequently if necessary.
The second mission responsibility is to survey officer utilization fields.
These surveys are accomplished on a selective basis depending upon
requests from major using commands or Air Staff agencies. Finally, the
branch provides occupational survey data relevant to specific
management problems throughout the federal government. These special projects
are accomplished on an as-requested basis.

The occupational analysis process involves four steps. The first is
the development of a comprehensive list of all the tasks that may be
performed by an individual in the specialty being surveyed. The second step
involves a validation of the job inventory by subject-matter specialists
stationed at operational units worldwide. The third step involves
administration of the job inventory to job incumbents, usually through the personnel
office at the local installation. The final step in the process involves
analysis of results. Data collected with the job inventory are processed
by computer through the use of the Consolidated Occupational Data Analysis
Program, or CODAP, developed by the Air Force Human Resources Laboratory,
and are then interpreted by an occupational analyst who prepares the report.

Air Force occupational analysis provides several types of information.
For example, specific tasks that personnel perform in accomplishing their
job and the items of equipment they use or maintain can be identified. Task
analysis provides data that allow the identification of the jobs that are
performed by graduates of Air Training Command courses. In addition, the
data provide information which identifies the progression of work that
occurs within each specialty. Finally, task analysis provides data which
can be used to assess the relative difficulty of jobs within a specialty,
the relative difficulty of tasks, and the critical nature of individual
tasks.

The final occupational survey report is sent to a number of staff
agencies who utilize the information in a number of ways. One use of
the data is to determine if the existing classification structure is
appropriate. Officials at the Air Force Military Personnel Center use
the data in conjunction with other information to restructure career
ladders. The data are utilized by the other mission element of the center
to aid in the development of promotion tests. All of the task analysis
data are processed and stored at the Air Force Human Resources Laboratory
and are available for their use in many different areas of personnel
research. The prime use of occupational survey data in the recent past has
been in the determination of training requirements and in support of the
instructional system development process.

To summarize, the Air Force Occupational Survey Program provides
data which help decision-makers determine what to
train; when to provide additional training and what that training should
consist of; the classification structure which will facilitate mission
accomplishment; and finally, how work should be designed.

Let's turn now to the other half of the center's dual mission:
developing promotion tests. The idea of using a weighted formula for
promotion purposes was implemented only in 1970. Its purpose was to provide
a visible system so the airman could see his relative standing in
promotion competition and ensure more equitable promotion opportunities among
enlisted personnel throughout the Air Force. Support of the Weighted
Airman Promotion System is the job of two of the center's branches.

One key promotion instrument is the specialty knowledge test, or
SKT, which occupies the efforts of the center 52 weeks a year. At present,
tests for approximately 230 career fields and their shreds are revised at
least once annually. We also write the promotion fitness examination, or
PFE, which is administered to 400,000 airmen each year.

The factors and weights that were originally developed to support
the Weighted Airman Promotion System concept have recently been
revalidated at the Air Force Human Resources Laboratory. These factors have
been found to result in highly similar decisions to those rendered by
actual promotion boards. As can be seen in Table 1, testing, in the
form of the first two factors, accounts for 42 percent of an airman's
total possible promotion points. Notice that the first two factors are
the only ones over which an individual has control. With this in mind,
the importance the Air Force places on having the best possible tests
should be obvious.

Non-commissioned officers Air Force-wide have almost unanimously
supported the Weighted Airman Promotion System. A major reason for
this approval has undoubtedly been the concept of tests "written by
airmen for airmen." The center's policy is to solicit only the most
highly qualified individuals to write these tests.

Two types of test construction projects are necessary to allow the
flexibility to meet Air Force goals of a new test for each promotion
cycle. A major revision is completed as necessary, or at least once
every two years. The minor revision is completed when a major revision
would be inappropriate or not feasible. The center asks for and normally
receives particularly good support from the Air Training Command both in
the form of career development courses, or CDCs, needed to write the SKTs
and the CDC writer or a highly qualified subject-matter specialist from
the school. The CDC writer is always an invaluable asset, since the CDCs
are used predominantly as the sole source reference for SKT development.


TABLE 1

WEIGHTED AIRMAN PROMOTION SYSTEM

FACTOR                             MAXIMUM POINTS   PERCENTAGE VALUE

SKT Score                               100               21%
PFE Score                               100               21%
Time in Service Score                    40                9%
Time in Grade Score                      60               13%
Airman Performance Rating Score         135               30%
Decorations                              25                6%

TOTAL                                   460              100%
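
The arithmetic behind Table 1 can be illustrated with a short sketch. The
maximum points and percentage values below are copied from the table; the
function names, the simple additive scoring rule, and the example airman's
scores are assumptions made only for illustration and are not taken from
any Air Force procedure.

```python
# Factor name -> (maximum points, percentage value), as listed in Table 1.
WAPS_FACTORS = {
    "SKT score": (100, 21),
    "PFE score": (100, 21),
    "Time in service": (40, 9),
    "Time in grade": (60, 13),
    "Airman performance rating": (135, 30),
    "Decorations": (25, 6),
}


def max_total_points():
    """Sum of the maximum points over all factors in Table 1."""
    return sum(points for points, _ in WAPS_FACTORS.values())


def testing_share_percent():
    """Combined percentage value of the two testing factors (SKT and PFE)."""
    return sum(WAPS_FACTORS[name][1] for name in ("SKT score", "PFE score"))


def total_points(scores):
    """Add up one airman's factor scores (assumed rule: plain summation).

    Each score must lie between 0 and the factor's maximum from Table 1.
    Factors missing from `scores` count as 0.
    """
    total = 0
    for name, (max_points, _) in WAPS_FACTORS.items():
        value = scores.get(name, 0)
        if not 0 <= value <= max_points:
            raise ValueError(f"{name}: {value} is outside 0..{max_points}")
        total += value
    return total


# Hypothetical airman, for illustration only.
example_scores = {
    "SKT score": 72,
    "PFE score": 65,
    "Time in service": 20,
    "Time in grade": 30,
    "Airman performance rating": 120,
    "Decorations": 5,
}
```

Under these assumptions, `max_total_points()` reproduces the table total of
460 and `testing_share_percent()` reproduces the 42 percent attributed to
testing in the text.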
Another selection instrument that does not fall into the promotion
category but which affects thousands of enlisted personnel each year is
the Apprentice Knowledge Test, or AKT. This device is not intended to
measure performance on the job. The results are used with other criteria
such as experience, training, and supervisory and command recommendations,
to determine whether or not the individual has the knowledge necessary to
bypass basic technical training in his specialty. The AKT has proven to
be quite a versatile instrument over the years. Upgrading to the
apprentice level has continued to be the most common purpose for which the test
is administered.

Work  on  a  test  development  project  actually  begins  months  before 
subject-matter  specialists  arrive  at  the  center.  The  scheduling  section 
works  with  the  major  command  to  obtain  the  best  qualified  specialists  for 
the test-writing team. Team members must be master sergeant selectees or
higher  in  grade. 

The cycle begins with team members taking the current tests. This
is done for two reasons: first, to allow team members to see what type
of questions comprise an SKT; and second, to spot any faulty or obsolete
questions. Usually a team consists of four people. Teams representing
specialties with shreds may have more team members.

The next step of the test development process is development of a
test outline. The outline is a reflection of the major paragraphs in
the specialty training standard. The occupational survey is also used
at this point to determine how many questions should be written on each
major task area.

Each test construction project is supervised and directed by a
test construction psychologist who is either an officer or a civilian
professional. A major source for questions is the current test; however,
a minimum number of new questions must be written. Teams begin by
writing a question. They submit it to the test psychologist who either
accepts it as is, rejects it, or modifies it. The test psychologist bases
his decisions on psychometric principles.

Once the team and the test psychologist are satisfied with a question,
they pass it on to the review psychologist, who has had considerable prior
experience as a test psychologist. The review psychologist will look at
the question and either accept it or send it back to the team for
modifications. A review psychologist can look at a question more objectively
since he has not been involved in the initial writing process. After he
has approved the question, he sends it to be typed. Questions are typed
on a magnetic tape Selectric typewriter. This procedure provides easy
item retrieval and also allows corrections to be made easily. Portions of
the test may also go to the illustration department, which is frequently
called upon to produce technical drawings for some of the test questions.
After all the questions have been typed, the team picks the best 100
questions out of a pool of 125 that they have developed. The extra 25
questions are called alternates. The 100 questions are placed in the
order in which they will appear on the final test and sent to typing
where a test manuscript is produced. This manuscript is reviewed during
a process called the master review. The subject-matter specialists, test
psychologist, and review psychologist are all present during the master
review. Each question and its choices are read aloud. It may be
necessary at this time to substitute an alternate for one of the original
questions. Once the team is satisfied with the test, they sign the
manuscript.


The test psychologist and review psychologist again review the
manuscript, this time checking for any grammatical or spelling errors
that may have been missed. The manuscript is then passed on to the
third level of control, a senior review psychologist. The center has
eight of these senior reviewers, and they are responsible for monitoring
career fields with respect to test development. Once a senior reviewer
is satisfied with the manuscript, he passes it on to typing where a
camera-ready copy is made. This copy is reviewed for typing errors and
then sent to the publishers. After the test has been published, it must
be reviewed a final time for printing errors before it is released to the
field.

The total project concept is emphasized by the compatibility between
testing and training. These two factors are put on a common basis by the
use of the same documents as the input source for developments in both
areas. As you can see in Figure 1, the outline used for test projects
is developed with familiar documents, including the Air Force Manual
39-1 job description, specialty training standard, and occupational
survey report. The sole reference a non-commissioned officer is normally
asked to study is the most up-to-date version of his career development
course. Thus, when he takes his SKT, he should have a feeling of
continuity and significant understanding of the material being tested -- much
of it being common to his previous technical school training, skill
upgrading exam, and subsequent study. Therefore, a degree of compatibility
between testing and training has been restored.

The dual missions of the Air Force Occupational Measurement Center
also complement one another. Data from the Occupational Survey Program
help ensure that airmen receive the best and most relevant training for
their specialty; and promotion tests developed at the center help ensure
that those airmen who best assimilate the knowledge of their specialty
are the airmen who are promoted. At the Occupational Measurement Center,
we like to think of it as "Management through measurement -- the basis
for the best."
FIGURE 1. Training - Testing Compatibility


TRENDS IN TEST DEVELOPMENT

by 

J.  Roger  Johnson,  Captain,  USAF 
USAF  Occupational  Measurement  Center 
Lackland  Air  Force  Base,  Texas 


I. INTRODUCTION

The test development function of the Occupational Measurement Center
is tasked with the development of examinations which are used for
occupational placement or promotion within the Air Force. These tests are
written by teams of subject-matter specialists (SMSs) selected for their
experience in the career field for which they are writing. These teams
work in consultation with experts in general test writing technique.

In retrospect over the last two to three years, four trends in this
test development function have emerged: first (1), increased attention
to conformance of the test construction process with the Uniform
Guidelines on Employment Selection Procedures established by the Equal
Employment Opportunity Commission (EEOC); second (2), increased interaction with
the Air Force training function; third (3), an expansion of the test
development role beyond that of the active duty military service
member; and finally (4), an increased cost-effectiveness in test
development as a response to an austere budgetary environment.


II.  CONFORMANCE  TO  EEOC  GUIDELINES 

Employment systems across the United States are finding themselves
increasingly involved with grievances regarding the validity of
examinations used for employment selection and promotion. While Title VII of
the Civil Rights Act of 1964, as amended, does not presently pertain to
Air Force military personnel per se, we have, nonetheless, taken several
proactive research and procedural initiatives to verify and ensure that
the Air Force test construction process remains fair, unbiased, and in
conformance with EEOC Guidelines.

Incorporation of Occupational Survey Data. A study was initiated in
which selected SMSs were assigned to relate occupational survey tasks to
respective test outlines. Results of the study are expected to facilitate
the systematic conversion of job analysis data into test construction.

At last year's annual convention, Captain David Vaughan reported a
methodology that can eventually be used to translate occupational survey data
directly into test outline weights. These weights would be used to
determine the number of test items to be written on various specialty knowledge
topics within a particular career field. This incorporation of occupational
survey data into the test development process will enhance the
job-relatedness (i.e., content validity) of our examinations.
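
The step from outline weights to item counts can be sketched as follows.
The methodology reported by Captain Vaughan is not described here, so this
is only an assumed proportional allocation using the largest-remainder
rule; the function name, the topic names, and the weights are hypothetical.

```python
def allocate_items(weights, total_items=100):
    """Split `total_items` test items across topics in proportion to weights.

    Assumed rule (largest remainder): each topic first gets the whole-number
    part of its proportional share, then any items still unassigned go, one
    each, to the topics with the largest leftover fractions.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    counts = {}
    remainders = {}
    for topic, weight in weights.items():
        # divmod keeps the arithmetic exact when the weights are integers.
        counts[topic], remainders[topic] = divmod(total_items * weight,
                                                  total_weight)

    leftover = total_items - sum(counts.values())
    by_remainder = sorted(weights, key=lambda t: remainders[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical outline weights for three topics, for illustration only.
example_weights = {"inspect": 5, "repair": 3, "records": 1}
example_counts = allocate_items(example_weights, total_items=100)
```

With the hypothetical weights above, the 100 items of a test would be
divided 56, 33, and 11 among the three topics. Any real weighting scheme
would rest on the occupational survey data themselves rather than on
figures such as these.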


Attention to Potential Discriminatory Factors. A literature review
has been completed regarding the factors that contribute to biased
examinations. Subsequently, a study was initiated to review test data
for identification of possible ethnic group or sex differences in test
scores. Further, test construction teams are instructed to avoid the
use of masculine and feminine pronouns that tend to arbitrarily assign
stereotyped roles. This policy supports the Air Force directive to
eliminate sex distinctions in Air Force publications. In addition, a
selection procedure has been implemented for members of the teams tasked
with the development of our professional military and supervisory upgrade
and promotion examinations. The procedure is based upon major command
nominations which are screened by the Air Force Military Personnel Center.
This procedure ensures an equitable career field, ethnic group, and gender
representation among those team members.

Investigation into Spanish Translations. A project was staffed to
consider the possibility of translating our tests into Spanish. In
researching background information on the subject, many positive and negative
aspects associated with translation of tests were identified. Primarily,
the concept of testing in the Spanish language for promotion to E-5 and
E-7 was considered contrary to the basic philosophy of the promotion system.
However, the study did offer alternative solutions related to the problem
of language proficiency standards of Spanish-speaking origin Air Force
personnel. One such alternative was to identify individuals having
trouble in the mastery of the English language very early in their careers
and to provide them with the required remedial training.


Promotion Opportunities of Detached Air Force Personnel. At the request
of the Defense Communications Agency (DCA), a study was conducted regarding
the promotion opportunity of Air Force personnel assigned to duty with
the DCA. The concern was that DCA personnel may be obtaining lower test
scores than non-DCA airmen. However, the results of the study indicated
that there was no significant difference between the groups. Such studies
ensure equitable promotion opportunities for Air Force personnel wherever
they may be assigned.


III. INCREASED INTERACTION WITH TRAINING FUNCTIONS


The  test  development  function  of  the  Occupational  Measurement  Center 
has  traditionally  supported  the  personnel  classification  function  of  the 
Air Force, initially with skill upgrade examinations and later with
promotion tests. However, in recent years, there has been increasing
interaction with the Air Force's training functions.

Specialty Training Standards. Since 1971, test outlines which are
developed to guide test writing have been required to be in conformance
with the published standards which guide training for each Air Force
Specialty. In addition, critiques regarding the adequacy of the training
standards are made by the SMSs who are assembled for test writing. These
critiques are forwarded directly to training OPRs for reaction.


Career  Development  Courses.  Whenever  available,  test  reference 
material  is  drawn  from  the  Career  Development  Courses  (CDCs)  published  by 
the  Air  Force's  Extension  Course  Institute  (ECI).  Existing  policy  supports 
the development and use of CDCs as the sole source for test references
for most career fields. Here too, critiques regarding the adequacy of
CDCs are made by the SMSs and forwarded to OPRs for reaction. At a recent
conference convened by ECI, the impact of modular extension course
materials was discussed. Such materials would have common volumes across
many career fields. If implemented, this would enable the development of
test  item  pools  for  those  career  fields  with  common  reference  materials. 

ECI  personnel  have  also  visited  the  Occupational  Measurement  Center  as 
part of a study to determine the extent of correlation between the content
and scores of our promotion tests and their end-of-course examinations.
Initial  findings  indicate  little  content  overlap. 

Export  of  Test  Writing  Expertise.  During  the  last  four  months,  the 
Occupational  Measurement  Center  has  dispatched  test  psychologists  to 
four  technical  training  centers  to  participate  in  local  seminars  on  test 
development.  Rules  for  item  writing,  use  of  item  statistics,  outline 
development,  quality  control,  and  test  validity  were  topics  of  discussion. 
This  exchange  of  expertise  is  intended  to  help  enhance  the  overall  test 
development  process  of  the  Air  Force. 

IV.  THE  EXPANDING  TEST  DEVELOPMENT  ROLE 

Although  the  Occupational  Measurement  Center's  mission  has  traditionally 
been  mandated  toward  Air  Force  active  duty  military  service  members,  that 
role  is  expanding. 

Tests for the Guard and Reserve. During the early to mid-seventies, the
center assisted in the development of a weighted promotion screening system
for the Air National Guard and the Air Force Reserve. Similar to that of the
Air Force, this proposed system included a component derived from an individual's
score on a specialty knowledge test. Implementation plans and procedures
have been developed for adoption of our active duty promotion tests for
Guard and Reserve use. Presently the project is in a deferred status due
to Guard and Reserve budgetary limitations. Per the last communication
received, a re-evaluation of the system is called for during the 1978
fiscal year (FY) budget cycle with intended implementation in FY 79.


Tests  for  Air  Force  Civilian  Personnel.  Since  October  1976,  the 
Center has participated in a series of conferences and discussions on the
possible use of our promotion tests for civilian wage grade inservice
placement actions. Such considerations as test security, the potentially
serious impact of test compromise, and the incompatibility between military
and civilian occupational tasks led to the conclusion that wholesale adoption
of our military promotion tests for civilian use would be inappropriate.
However, the possibility of the center assisting in the development of
separate civilian tests appears to be a viable alternative. At this time,
dialogue continues on the subject with possible experimental test
developments within the next year.

Coast Guard Workshops. Over the past three years, the U.S. Coast
Guard has periodically invited center representatives to conduct workshops
at  their  training  center  in  Petaluma,  California.  The  workshops  have 
provided  training  in  test  writing  procedures  to  Coast  Guard  instructors 
in  grades  E-5  to  E-7. 


V.  INCREASED  COST-EFFECTIVENESS 

In  conformance  with  the  austere  budgetary  environment  in  which  military 
agencies are required to function, several actions have been taken to
ensure the most efficient and cost-effective test development system possible.

Test Construction Procedural Innovations. In the last year, several
procedural innovations in test construction have contributed to increased
efficiency. For example, many career fields have common tasks dealing
with maintenance management. Given this commonality, a maintenance
management item pool has been established to facilitate test writing in
that area for those specialties concerned. Other innovations include
changing the stagger in the scheduled arrival of teams in order to even
the workload on our word processing section, reducing the number of
required alternate test items, and excluding from periodic revision those
3-level upgrade examinations which have a low frequency of use. During
the  last  year,  the  net  effect  of  those  innovations  has  reduced  by  51 
the  duration  of  test  construction  projects. 

Validation  of  Test  Question  Formats.  A  study  was  conducted  to  examine 
the  effectiveness  of  test  question  format  procedures  (e.g.  preferences  of 
positive  items  over  negative,  multifactor  over  single  factor,  situation 
based over non-situation based, and open stem over closed stem).
Experimental tests have been devised and administered and data has been collected.
The  pending  results  will  ascertain  the  validity  of  these  format  preferences 
and  justification  of  the  time  and  effort  expended  in  developing  them. 

Lateral Trainee Testing. Presently, most airmen who transfer from
one career field to another attend a basic level technical school if
available in the new career field. However, it has been hypothesized
that many of these students may already possess enough basic knowledge
and skill to warrant by-passing the basic school. It was therefore
suggested that the Apprentice Knowledge Test (given to selected airmen
entering the service) be administered as a screening device. Center
personnel studied the issue and results indicated that in most specialties,
many lateral trainees should, indeed, pass their particular Apprentice
Knowledge Test if mandatory administration was required. While action
is  pending  further  evaluation  of  the  criteria  for  minimum  passing  scores, 
the  study  appears  to  support  the  expectation  for  a  substantial  savings  in 
formal  training  resources. 

Evaluation  of  Civilian  Contract  Test  Writing.  The  center  reviewed  the 
possibility  of  using  civilian  contract  test  construction  for  some  special 
occupational areas. However, the preference for the policy of test
construction "by airmen, for airmen" via the subject-matter specialists and
the  fact  that  the  Air  Force  cost  for  test  development  was  less  than  that 
proposed  by  the  contractor,  contributed  to  the  decision  not  to  adopt  the 
civilian  contract  proposal. 


VI.  CLOSURE 

The purpose of this paper has been to familiarize the reader with the
contemporary issues surrounding test development at the USAF Occupational
Measurement Center and to establish the emergence of four trends in that
development function, those trends being (1) increased attention to EEOC
Guidelines, (2) increased interaction with the Air Force training function,
(3) an expansion of the test development role, and (4) increased
cost-effectiveness in response to our austere budgetary environment.


THE  USAF  OCCUPATIONAL  SURVEY  PROGRAM 
by 

James  B.  Keeth 

Military  Occupational  Analyst 
USAF  Occupational  Measurement  Center 
Air  Training  Command 
Lackland  AFB  TX 


Good afternoon. I would like to present during this part of our
symposium a brief overview of the operational occupational survey program
that is conducted by the Occupational Measurement Center for the United
States Air Force. The USAF Occupational Survey Program actually began
back in 1956 under the guidance of Dr. Raymond Christal of the Air Force
Human Resources Laboratory. During the first 11 years of the program,
the primary emphasis was placed on development of the methodology required
to conduct occupational surveys, including the computer programs necessary
for analysis. This research was conducted by Dr. Christal and the Human
Resources Laboratory. In 1967, an operational program was set up at
Lackland AFB under the auspices of Air Training Command. In the early
years of the program, 15 professional, technical, and clerical personnel
conducted surveys on 12 enlisted career ladders a year. Today, the
program has grown substantially, with 59 assigned personnel and a capability
of conducting surveys on 51 career ladders annually, as well as conducting
officer surveys, special projects, and electronic principles inventories.
Over the last 10 years, we have surveyed over 250 occupational fields
involving over 500,000 personnel.

The occupational analysis program, as conducted by the Occupational
Survey Branch, involves four primary steps. The first step is the
development of a job inventory. This consists of a listing of those
tasks which may be performed in a particular occupational field. The
second step involves validating this task list with subject matter
specialists working in operational units. In the third step, the
validated task list is administered to the field. The fourth and final step
consists of analyzing the data and reporting the results to the various
using agencies.

In developing the initial list of tasks for the job inventory,
inventory developers at the Center research all pertinent career field
documents and publications, such as the AFR 39-1 Specialty Descriptions,
Specialty Training Standards (STSs), and Career Development Courses
(CDCs). In addition, previous task lists are reviewed for usable
tasks. From this research, a tentative list of tasks is developed.
During this early stage of the development process, the developer will
consult with classification personnel at the Military Personnel Center
at Randolph AFB regarding any potential changes planned for the career
ladder or other problems that may be of interest. In some cases, functional
managers at the Air Staff or MAJCOM level may be involved.

Many times, the occupational survey report will be the first step
in a restructuring proposal for an occupational field. An example of
this involved the recent occupational survey of the computer systems
occupational field. The Air Staff functional manager had received
numerous letters from the MAJCOMs regarding weaknesses in the present
structure and possible poor utilization of some personnel in the field.
To help in coming up with some positive answers to these comments, an
occupational survey was requested. Upon completion of the survey, it
was indeed determined that personnel were being utilized improperly in
some instances and that some restructuring was required. Based on
the results of that survey, restructuring has been approved and will
become effective next April.

Once the tentative task list is completed, the inventory developers
go out to the field to talk first hand with subject matter specialists.
This involves visiting the technical training center where training in
the specialty is provided. It also involves a visit to several operational
units where subject matter specialists are actually doing the job. Once
it becomes evident that a fairly complete task list has been obtained
from the interviews, the task list is reproduced and mailed out to a
small sample of subject matter specialists in operational units in the
CONUS and overseas for validation purposes. Once the comments and
changes from this field validation are evaluated and pertinent changes,
additions, or deletions are made, the task list is printed in a final
form and published as a United States Air Force Job Inventory.


Basically, all job inventories consist of two sections. The first
section is a set of questions relating to background characteristics of
job incumbents as well as to work information. Here we ask for such
information as name, grade, command, and how long they have been in the
occupational field, in their present job, and in
the Air Force. They are also asked about their job satisfaction and
reenlistment intentions. In addition, we obtain information about their
work environment, the equipment they use or maintain, the types of
aircraft or weapons systems worked on, courses completed, and work
section. The amount of information obtainable is almost unlimited.


The second section of a job inventory is simply a detailed listing
of tasks which may be performed. When completing this section, job
incumbents check those tasks which they perform in their present job.
Once they complete this procedure, they go back and rate each task
according to the amount of time they spend performing it relative to all
other tasks being performed.


Once the job inventory is published, the first two steps in our
survey program are completed. Step three involves the administration of
the inventory to career field members. Local consolidated base personnel
offices receive the booklets and administer them to personnel specified
by us. Names of job incumbents who are to receive the booklet are
obtained from the Uniform Airman Record file provided by the Military
Personnel Center. Our sample is a stratified random sample by skill
level, command, and job location. In occupational fields with 3,000 or
fewer incumbents, we survey the total population of job incumbents who
have been on the job at least six weeks. Where there are more than 3,000
job incumbents, some percentage of the total number is surveyed.
Normally this will range up to 30 percent depending upon the diversity
of the field being surveyed.
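
The sampling rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Center's actual procedure: the field names (skill_level, command, location, weeks_on_job), the function name, and the choice of drawing the same fraction from every stratum are all hypothetical, and the 30 percent figure is used simply as the upper end of the range mentioned above.

```python
import random
from collections import defaultdict


def select_survey_sample(incumbents, census_limit=3000, fraction=0.30, seed=0):
    """Pick the job incumbents who should receive an inventory booklet.

    incumbents: list of dicts with the (hypothetical) keys
    'skill_level', 'command', 'location', and 'weeks_on_job'.
    """
    # Only incumbents with at least six weeks on the job are eligible.
    eligible = [p for p in incumbents if p["weeks_on_job"] >= 6]

    # Small fields: survey the total eligible population.
    if len(incumbents) <= census_limit:
        return eligible

    # Large fields: group eligible incumbents into strata.
    strata = defaultdict(list)
    for person in eligible:
        key = (person["skill_level"], person["command"], person["location"])
        strata[key].append(person)

    # Assumption: draw the same fraction at random from every stratum,
    # taking at least one person so no stratum is left out.
    rng = random.Random(seed)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample
```

Drawing within each stratum, rather than from the field as a whole, is what keeps every skill level, command, and location represented in the returns.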

Job inventories are returned to the Center where they are scanned
and reviewed for completeness and accuracy. It is very important that
incorrectly filled out booklets are identified and returned to the job
incumbent for reaccomplishment. Also, our survey analysts review the
returned booklets to make sure that we are receiving returns from all
skill levels, commands, units, and locations. This careful sampling and
checking of the returned booklets ensure that the final sample will
adequately represent the total occupational field population.

Once the returns are reviewed and accepted by the survey analyst,
step four of our analysis process begins. The booklets are scanned on
an optical scanner, the background information is keypunched, and the
data is input into the Human Resources Laboratory's UNIVAC 1108 computer.
Using the Comprehensive Occupational Data Analysis Programs, or CODAP as
it is commonly known, the data are analyzed and a final occupational
survey report is written. As many of you already know, CODAP is simply
a series of highly complex computer programs that are used to reduce the
large amount of data obtained into a more manageable form.

In analyzing the data, the central focus of the survey analyst is
to look at how the occupational field is structured. One unique strength
of CODAP is its capability of grouping job incumbents solely on the basis
of task similarity and producing what is called a cluster merger diagram.
From this diagram, the analyst can determine the job structure as it
actually exists out in the field. This powerful tool allows us to use
this information as a foundation from which we can look at the career
field documents, such as the AFR 39-1 specialty descriptions and the
specialty training standard, and evaluate whether they are accurately and
realistically describing what job incumbents are actually doing in the
field. These documents are often based on what is believed to be
done in the field and how jobs are believed to be structured. Our data,
through this grouping process, provides information as to how jobs are
actually performed and how they are organized. Often this preconceived
idea of jobs does not reflect the actual work performed (Driskill,
1975).
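
To make the idea of grouping incumbents by task similarity more concrete, the following Python sketch merges the most similar groups step by step and records each merger, which is the kind of information a merger diagram displays. It is an illustration only and should not be read as the CODAP algorithm: the similarity measure (summed overlap in percent time on shared tasks), the average-linkage rule, and all names are assumptions made for this example.

```python
def overlap(a, b):
    """Percent-time overlap between two task profiles (task -> percent time)."""
    return sum(min(a[t], b[t]) for t in a.keys() & b.keys())


def cluster_merges(profiles):
    """Merge the two most similar groups repeatedly until one group remains.

    profiles: dict mapping an incumbent's name to a task profile.
    Returns a list of (group_1, group_2, similarity) in merger order.
    """
    clusters = {name: [name] for name in profiles}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for i, x in enumerate(keys):
            for y in keys[i + 1:]:
                # Average linkage: mean overlap across all cross-group pairs.
                pairs = [(p, q) for p in clusters[x] for q in clusters[y]]
                sim = sum(overlap(profiles[p], profiles[q]) for p, q in pairs) / len(pairs)
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        sim, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), sim))
        # Fold group y into group x.
        clusters[x] = clusters[x] + clusters.pop(y)
    return merges
```

Early mergers at high similarity indicate people doing essentially the same job; late mergers at low similarity mark the boundaries between distinct job types.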

In addition to looking at job structure, the analysis phase can
provide other useful information. It can yield a comprehensive listing
of tasks performed by personnel in the field. This listing can be
broken down by any given group performing each of the tasks. It can
show how many people perform each task, as well as the percentage of time
spent by those members who perform the tasks. In addition, task and job
difficulty data is collected for each occupational field surveyed.
Interrater agreement is very high for this data, and we only report data
when agreement exceeds .90. Very few sets of ratings have fallen below
this criterion in the surveys we have conducted over the years (Driskill,
1975).
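
The two task statistics mentioned above, percent of members performing and percent time spent by those who perform a task, can be illustrated with a short Python sketch. It assumes, for the purpose of the example, that each respondent's relative-time ratings are converted to percentages by dividing each rating by the sum of that respondent's ratings; the function and key names are hypothetical.

```python
def percent_time(ratings):
    """Convert one respondent's relative-time ratings to percent of job time."""
    total = sum(ratings.values())
    if total == 0:
        return {}
    return {task: 100.0 * rating / total for task, rating in ratings.items()}


def task_summary(responses):
    """Summarize tasks across respondents.

    responses: list of dicts mapping task -> relative-time rating,
    with tasks the respondent does not perform left out.
    """
    n = len(responses)
    performers = {}
    time_sum = {}
    for ratings in responses:
        for task, pct in percent_time(ratings).items():
            performers[task] = performers.get(task, 0) + 1
            time_sum[task] = time_sum.get(task, 0.0) + pct
    return {
        task: {
            "percent_members_performing": 100.0 * performers[task] / n,
            "avg_percent_time_by_performers": time_sum[task] / performers[task],
        }
        for task in performers
    }
```

Running the same summary on a subset of respondents (for example, one skill level or one command) gives the group breakdowns described in the text.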

Once our data has been analyzed and an occupational survey report
has been written and released, the data is put to use by the Air Force
in a variety of ways. The data is often used by classification personnel
to look at career field structuring, in which the present structure is
validated or restructuring is recommended. The data is also used by our
SKT branch to aid in developing promotion tests. One of the primary
issues today involves having valid criteria on which to base many of the
tests administered. Our data can help provide those criteria. Dr.
Christal and his staff at the Human Resources Laboratory also use our
data for personnel research. Occupational data is also used in the
instructional system development (ISD) program to analyze system
requirements, to define education and training requirements, and to conduct
evaluation of instruction.

But perhaps the most important use today of our data is in determining
training requirements. In today's environment where the training dollar
is tight, it is all too important that training be geared only to what
the person will need to do his job effectively. In this regard, the
emphasis today is placed on determining how job incumbents will be
utilized in their first job assignment, identifying those tasks for which
the probability of performance by airmen in this first assignment is high,
and providing initial training on these tasks. Therefore, our data is
useful in designing initial courses that train only for the first job as
well as providing valuable information for what to include in follow-on
training.
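
As a rough illustration of how survey returns might be used to pick tasks for initial training, the Python sketch below estimates the probability of performance as the share of first-assignment respondents who report performing each task and keeps the tasks above a cutoff. The 0.5 cutoff and the function name are assumptions chosen for the example; the source does not state what counts as a high probability.

```python
def first_job_training_tasks(first_assignment_responses, threshold=0.5):
    """Return tasks performed by at least `threshold` of first-assignment airmen.

    first_assignment_responses: list of collections of task names, one per
    respondent in a first job assignment.
    """
    n = len(first_assignment_responses)
    counts = {}
    for tasks in first_assignment_responses:
        # Count each task once per respondent.
        for task in set(tasks):
            counts[task] = counts.get(task, 0) + 1
    return sorted(task for task, c in counts.items() if c / n >= threshold)
```

Tasks that fall below the cutoff are candidates for follow-on training rather than the initial course.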

This then is a very brief look at the Air Force occupational survey
program. We feel that the program has great potential in helping solve
many of the problems faced today by the Air Force. More and more people
are turning to us for help and we are taking steps to provide this
assistance. To help provide us with the capability to take on new
challenges, Dr. Christal and his staff are continually providing us with
new and improved techniques. All in all, ours is a growing program with
a bright future. Thank you for your time.


MANAGEMENT APPLICATIONS AND SPECIAL PROJECTS

Stanley  D.  Stephenson 
Rodger D. Ballentine

As the USAF Occupational Survey Program and the awareness of it grew,
many special one-time only requests from a variety of agencies started being
received. These were handled within the existing structure simply by diverting
resources from the normal mission to accomplish the special request. A case
in point was the Electronic Principles Inventory (EPI) whose development
was presented in a paper at the 1976 MTA and elsewhere.

In addition to special requests, officer surveys had not, for several
reasons, been a dominant factor in the overall occupational survey program.
By 1974, however, the normal program had grown to the point where we were
ready to start surveying officer career areas on a more permanent basis.

By 1975 it had become evident that the special projects and officer
surveys were an integral part of the survey program and that manning should
reflect this. The Officer Survey Management Applications Section was
created and staffed in 1976 to handle projects that were above and beyond
the normal enlisted career field occupational survey program. The section
was staffed primarily with experienced analysts and developers; in order
to achieve full manning, however, less experienced personnel had
to be added later. Among the less experienced personnel added to the section
were an educational specialist and a social psychologist, both of whom
increased the versatility of the section to examine unique problems. In fact,
it has become obvious that such diversity of background adds dramatically
to the flexibility of the section.

The first special project undertaken by this new section was to define
the job performed by weapons officers flying tactical aircraft. Tactical
Air Command training managers requested data to develop training programs
for these officers.

Several surveys were developed to gather information about tasks
performed, task difficulty and criticality, and job knowledges requiring
training. Traditional interview development procedures were primarily
used, but a unique method was used to validate the job survey task list.
Rather than mail the survey instrument to job incumbents for validation,
experienced weapons officers came together with survey developers in
a working conference to review and validate the task list.

The job survey was administered to all job incumbents and additional
task factor surveys were administered to smaller groups of experienced
weapons officers. A separate survey was also administered to assess the
requirement for formal training of knowledges unique to the fighter weapons
area. Survey returns were processed and analyzed using the Comprehensive
Occupational Data Analysis Programs (CODAP) package. These data provided
valuable insight into the jobs performed and training required for weapons
officers. Overall, the success of this project proved that occupational
survey techniques can be adapted to meet the objectives of a requesting
agency.

Of all the experiences we gained from our first major special project,
two stand out and have become a part of our "look out for" list. The first,
and the more critical for application, is how the user plans to make use
of occupational data. At the conclusion of the FWIC project it became
clear that the requesting command simply was not fully aware of how much
information can be gleaned from occupational survey data nor were they
fully cognizant of the usefulness of much of the data. Consequently we
had to spend time educating the user on our project. To prevent this
situation from recurring, we now actively educate the requesting agency
starting with the first meeting. This takes the form of briefings on
how the data can be used (often in ways unknown to the user); inclusion
of user representatives during the project life, especially during the
analysis stage; and an agreement on what user office will handle the
data and who will be the continuity link from start to finish.

The second major learning experience involved the use of operational
requestor personnel in the final validation of the job inventory. This
greatly added to our confidence in the job inventory and, in part, helps
overcome the uniqueness of each special project. It also proved
advantageous in one other way. Having experienced operational personnel
involved in the final decisions about the job inventory increased the
salability of the inventory to the survey respondents in the field who
more often than not had never heard of the occupational survey program.
However, they normally knew of the experienced personnel who helped in the
final validation, and by citing their names we greatly added to the
quality and quantity of survey returns.

Besides the FWIC we are currently engaged in a wide variety of other
projects. The EPI, mentioned earlier, is now winding down after receiving
heavy emphasis. For instance, in April 1977 we mailed over 10,700 EPI
booklets to all those career ladders receiving EP training who had not yet
been surveyed. Quite naturally this massive effort severely taxed our
capability and the capability of the field. Nonetheless we did turn the
project around and now have delivered EPI reports on approximately 90%
of the career fields involved.

Several other projects warrant mentioning because of either their
scope or uniqueness. We are presently involved in an analysis of three
reporting identifiers for the Air Force Systems Command. Reporting identifiers
are a system of classifying airmen who perform widely divergent tasks; our
involvement will truly test our ability to describe job structure. The
Defense Intelligence Agency has requested we analyze the job performance
of their school graduates. This project is important to us not only
because it is a DOD agency but also because it will deal primarily with
executive type tasks. Such will also be the case with our Professional
Military Education project for Air University. This project, being
conducted for the Leadership Management Development Center, will analyze
the management, leadership, and communications behaviors performed by
both NCOs and officers. Finally, our job analysis task list development
project for the Federal Procurement Institute is important to us because
of its scope (over 20,000 federal procurement positions) and because it is a
federal agency versus an Air Force or DOD agency.

Our increased involvement with executive type tasks coincides with the
growth of our officer survey program which has served to highlight the
importance of being able to measure executive behaviors. We have just
completed an analysis of the Security Police career area and are in the
administration stage for two other officer surveys, one for Weapon Controllers
and one for Space Systems Analysts. We have found that officer surveys are
not merely an extension of enlisted surveys; rather they are a process unique
unto themselves.

It is obvious that the Management Applications and Officer Survey
Section has succeeded in its stated purpose. Perhaps it has succeeded
too well, for we find that as the awareness of the value of our consulting
grows we simply receive more requests for assistance. We view this with
encouragement for it is a validation of how we can assist in critical
management, classification, and training decisions. As we learn and grow
with each project, we only add to that capability.




The Interface Between Occupational Survey
and Test Construction

David S. Vaughan, Capt., USAF
ATC Technology Applications Center


The USAF Occupational Measurement Center has two primary
missions: conducting occupational surveys and developing Specialty
Knowledge Tests (SKTs). These two missions are certainly distinct;
in other armed services they are accomplished by different
organizations. However, the test construction and occupational surveying
activities within the USAF Occupational Measurement Center have
contributed a great deal to each other. The purpose here is to
discuss some examples of this cross-fertilization between occupational
surveying and test construction.

First,  some  examples  will  be  given  of  contributions  which
occupational  surveys  are  making  in  test  construction.  The  Specialty
Knowledge  Tests  written  at  the  USAF  Occupational  Measurement  Center
are  designed  to  measure  job  knowledge  in  particular  Air  Force  job
specialties.  These  tests  are  developed  by  teams  of  subject-matter
specialists  (senior  NCOs  on  temporary  duty  status)  along  with  the
test  construction  experts  of  the  Center.  Occupational  surveys  provide
extremely  good  information  concerning  what  airmen  actually  do  in  each
of  the  various  job  specialties  and  therefore  can  be  useful  in
determining  the  topics  to  be  covered  on  Specialty  Knowledge  Tests.  Since
the  early  days  of  Air  Force  occupational  surveying  the  survey  data
has  been  made  available  to  test  construction  teams.  This  data
has  been  useful.  However,  test  construction  teams  found  the
occupational  survey  reports,  which  contain  a  great  deal  of  information,
difficult  to  use.  Several  years  ago,  special  occupational  survey
computer  printouts  were  made  available  for  test  construction.  Those
printouts  contain  simple  listings  of  all  tasks  in  a  particular  job
specialty.  Also  printed  are  the  task  difficulties  and  percent  members
performing  and  percent  time  spent  on  each  task  by  airmen  in  the  target
population  for  each  test  to  be  constructed.  Such  printouts  present
the  information  which  is  most  useful  for  test  construction  in  a  compact,
easy-to-understand  format.  This  printout  format,  which  was  developed
in  close  consultation  between  test  construction  and  occupational
survey  personnel,  has  facilitated  use  of  survey  data  in  test  construction.

Recently,  USAF  Occupational  Measurement  Center  personnel  have
been  working  on  even  more  effective  ways  of  using  occupational  survey
data  in  test  construction.  This  author  presented  a  paper  at  last  year's
Military  Testing  Association  conference  (Vaughan,  1976)  concerning  one
of  these  efforts.  The  goal  of  this  effort,  which  is  still  in  progress,
is  to  develop  methods  of  automatically  converting  occupational  survey
data  into  numbers  of  test  items  to  be  written  on  each  topic.  Under
procedures  currently  used  at  the  Center,  the  numbers  of  test  items  to
be  written  on  each  topic  are  determined  by  the  subject-matter  specialists,
based  on  the  occupational  survey  data  and  on  their  professional  judgments.
The  research  has  used  policy-capturing  methodology  to  find  an  equation
for  combining  occupational  data  into  numbers  of  test  items  to  be  written
on  groups  of  tasks.  The  data  gathered  so  far  show  that  subject-matter
specialists'  judgments  concerning  numbers  of  test  items  can  be  predicted
quite  accurately  from  survey  data.  However,  the  results  also  show
that  different  equations  are  probably  necessary  in  different  job
specialties.  Further  research  is  necessary  to  investigate  the
usefulness  of  a  simple,  perhaps  unweighted,  equation  across  specialties  and
to  identify  groups  of  specialties  which  can  use  common
equations.
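
The policy-capturing idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that the captured policy is a linear equation fitted by ordinary least squares, with survey statistics for a task group (for example percent members performing and task difficulty) as predictors and the specialists' judged numbers of items as the criterion. The function names fit_policy and predict are hypothetical; the paper does not give the actual equations or predictors used at the Center.

```python
def fit_policy(rows, targets):
    """Fit judged item counts to survey predictors by ordinary least squares.

    rows    -- one tuple of survey statistics per task group (assumed predictors)
    targets -- the specialists' judged number of items for each task group
    Returns [intercept, weight_1, weight_2, ...].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    k = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def predict(weights, row):
    """Apply the captured policy to the survey statistics of a new task group."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))
```

Fitting one set of weights per job specialty and comparing them with a single pooled set would be one way to examine the question raised above, namely whether a common equation can serve several specialties.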

USAF  Occupational  Measurement  Center  personnel  have  also  been
conducting  research  on  another  approach  for  using  occupational  survey
data  in  test  construction.  In  this  approach,  occupational  data  is
used  to  compute  a  testing  priority  index  for  each  task.  Tasks  are
eliminated  which  are  performed  by  insignificant  numbers  of  airmen  in
the  target  population  of  the  test  to  be  constructed.  Test  construction
teams  are  given  a  list  of  the  remaining  tasks  in  order  of  testing
priority.  The  subject-matter  specialists  study  this  list  of  tasks  and
determine  which  tasks  are  suitable  for  inclusion  on  a
paper-and-pencil,  multiple  choice  test.  These  tasks  are  covered  on  the  test,
and  the  numbers  of  items  written  on  each  task  are  based  on  the  testing
priorities  of  the  tasks.  Several  variations  of  this  method  have  been
used  with  test  construction  teams  and  have  been  successful.  However,
further  research  is  necessary  concerning  algorithms  for  computing
testing  priority  and  for  removing  tasks  from  the  listing  to  be  given
to  the  test  construction  teams.
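
A minimal sketch of this second approach follows. The priority formula (percent members performing multiplied by task difficulty), the 5 percent elimination cutoff, and the proportional allocation rule are all assumptions made for illustration; the paper states only that an index is computed and that the algorithms still need research. The names Task, ranked_tasks, and allocate_items are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    pct_performing: float  # percent of target-population members performing
    difficulty: float      # task difficulty rating from the survey


def priority(task):
    # ASSUMPTION: the actual testing priority index is not given in the paper.
    return task.pct_performing * task.difficulty


def ranked_tasks(tasks, min_pct=5.0):
    """Drop tasks performed by too few airmen, then list the rest by priority."""
    kept = [t for t in tasks if t.pct_performing >= min_pct]
    return sorted(kept, key=priority, reverse=True)


def allocate_items(suitable, total_items):
    """Split total_items across the tasks the specialists judged suitable,
    in proportion to priority (largest-remainder rounding)."""
    total = sum(priority(t) for t in suitable)
    if total == 0:
        return {}
    shares = {t.name: total_items * priority(t) / total for t in suitable}
    counts = {name: int(share) for name, share in shares.items()}
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts
```

In use, the list returned by ranked_tasks would go to the test construction team, and only the tasks the team judges testable on a paper-and-pencil test would be passed on to allocate_items.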

Both  of  the  methods  for  systematic  use  of  occupational  survey
data  in  determining  test  content  have  merit.  In  recent  months,  a
synthesis  of  these  two  methods  has  been  worked  out.  However,  this
synthesis  has  yet  to  be  tested.  Regardless  of  which  method  eventually
becomes  operational,  more  systematic  use  of  occupational  survey  data
in  test  construction  will  result  in  a  significant  time  savings  for
the  test  construction  process.  Furthermore,  test  content  validity
will  be  more  systematically  assured.

The  procedures  just  discussed  use  occupational  survey  data  for
content  validation.  Occupational  survey  methodology  is  also  being
used  to  gather  job  performance  data.  In  a  project  currently  being
conducted  by  Center  personnel,  an  occupational  survey  task  list  is
being  used  to  gather  supervisory  ratings  of  job  performance.  First,
an  airman  indicates  which  tasks  he  or  she  performs.  Then,  the
airman's  supervisor  rates  the  airman's  performance  on  each  task  performed.
Hopefully,  these  task  performance  ratings  will  be  less  contaminated
with  halo  effects  than  overall  ratings  usually  are  and  therefore  will
be  more  useful  in  test  validation.  Furthermore,  this  approach  will
allow  test  scores  to  be  related  not  only  to  how  well  tasks  are
performed  but  to  what  tasks  are  performed  as  well.  Survey  returns  are
currently  awaited.  Therefore,  the  success  of  this  procedure  cannot
yet  be  determined.  However,  it  seems  likely  that  occupational  survey
methodology  will  make  a  significant  contribution  to  gathering  job
performance  data  for  use  in  test  construction.

Occupational  survey  data  can  be  used  in  validating  individual  test
items.  The  first  step  in  a  procedure  for  item  validation  is  to  identify
all  tasks  which  relate  to  each  item  on  a  test.  The  amount  of  experience
on  tasks  related  to  an  item  can  be  compared  to  the  probability  of  passing
the  item.  This  procedure  has  not  yet  been  implemented  at  the  USAF
Occupational  Measurement  Center.  However,  preliminary  planning  is  in
progress  for  a  test  of  this  procedure,  which  is  another  promising
application  of  occupational  survey  data  in  test  construction.
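
One way such a comparison might be made is sketched below. Since the procedure had not been implemented, the statistic is an assumption: the sketch uses the point-biserial correlation (the Pearson correlation between a continuous experience measure and a pass/fail item score). The function names and the choice of experience measure are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def item_experience_validity(experience, passed):
    """Point-biserial correlation between experience on the tasks related
    to an item (e.g. percent time spent) and passing that item.

    experience -- one number per examinee
    passed     -- one bool per examinee (True if the item was answered correctly)
    """
    return pearson(experience, [1.0 if p else 0.0 for p in passed])
```

Under this reading, an item whose pass rate rises with experience on its related tasks would show a clearly positive coefficient, while a coefficient near zero would flag the item for review.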

Several  important  contributions  which  the  occupational  survey 
program  has  made  in  test  construction  were  outlined  above.  In  return, 
test  construction  has  made  important  contributions  to  the  occupational 
survey  program.  Subject-matter  specialists  play  an  important  role 
both  in  the  development  of  a  task  inventory  and  in  analysis  of  survey 
results.  Subject-matter  specialists  are  an  important  source  of  tasks 
for  an  inventory.  Ordinarily,  Center  inventory  development  personnel
make  extensive  TDY  trips  to  other  bases  in  order  to  get  input  from
subject-matter  specialists.  However,  many  subject-matter  specialists
come  to  the  Center  for  test  construction.  When  scheduling  permits,
these  test  construction  personnel  are  used  in  developing  task  lists,
thereby  reducing  the  need  for  Center  personnel  to  make  expensive  trips
to  other  bases.

Subject-matter  specialists  can  also  play  an  important  role  in 
the  analysis  and  interpretation  of  an  occupational  survey.  One  step 
in  the  analysis  of  a  survey  is  establishment  of  the  relationship  between 
the  task  list  and  important  Air  Force  training  documents  such  as  the
Specialty  Training  Standard  (STS)  and  the  Plans  of  Instruction  (POIs).
Use  of  subject-matter  specialists  can  be  an  important  part  of  this  step.
Again,  use  of  subject-matter  specialists  brought  to  the  Center  for  test
construction  duties  can  result  in  significant  economy  and  efficiency.
Subject-matter  specialists  can  be  helpful  in  interpretation  of
occupational  survey  results.  The  subject-matter  specialists  on  a
test  construction  team,  who  have  a  wide  variety  of  backgrounds  and 
experience  in  a  job  specialty,  can  be  particularly  useful  to  survey 
analysts  in  data  interpretation. 

While  occupational  surveying  and  test  development  are  two  distinct 
missions,  the  above  examples  demonstrate  that  these  two  activities  are 
complementary.  At  the  USAF  Occupational  Measurement  Center,  each  has 
benefited  from  the  other.  Because  in  the  Air  Force  these  two  activities 
are  accomplished  within  one  organization,  the  author  believes  that  the 
cross-fertilization  between  occupational  surveying  and  job  knowledge
testing  has  been  greatly  facilitated.


THE  STATE  OF  THE  ART  IN  JOB  TASK  ANALYSIS


Dr  Robert  Pulliam


INTRODUCTION 


Over  the  past  two  and  one-half  years,  Kinton,  Incorporated  has
been  working  with  AFHRL  collecting  learning  difficulty  data  on
selected  tasks  in  various  career  fields  in  the  Air  Force.  These  data
were  collected  using  field  teams  of  expert  observers  and  task-anchored
rating  scales.  The  task-anchored  scales  were  developed  and  used  to
provide  a  "yardstick"  for  measuring  learning  difficulty  across  similar
Air  Force  career  fields,  using  field  teams  as  the  mode  for  collecting
the  data.  In  addition  to  the  rating  data,  other  significant  products
of  this  effort  were  procedural  guides  for  using  the  task-anchored
rating  scales.  The  emphasis  of  this  paper  will  be  on  the  development
of  the  task-anchored  scales  and  on  our  success  with  the  use  of  field
teams.  But  first  I  would  like  to  provide  you  with  some  background
leading  to  this  effort.


BACKGROUND 

The  Armed  Services  Vocational  Aptitude  Battery  (ASVAB)  is  used  to
assess  the  aptitude  of  new  recruits  in  the  Air  Force,  but  there  has
been  no  similarly  objective  measure  of  job  aptitude  requirements.
Preliminary  survey  evidence  indicates  that  some  job  aptitude  requirement
levels  are  out  of  alignment.  As  a  result,  the  most  talented  applicants
may  not  be  receiving  assignments  to  the  most  demanding  jobs.  This  in
turn  could  lead  to  increased  training  costs,  job  dissatisfaction  and
fewer  reenlistments.  Thus,  a  requirement  existed  for  research  into  a
means  to  assess  job  aptitude  requirement  levels  for  tasks,  jobs,  and  job
types  in  the  Air  Force.

The  USAF  began  this  research  with  task  inventories  which  described
the  job  content  of  career  fields  in  terms  of  specific  tasks  and  other
quantifiable  characteristics.  Included  were  initial  estimates  of
task  learning  difficulty.  These  estimates  were  gathered  during  an
occupational  survey,  and  consist  of  learning  difficulty  ratings  by
NCOs  for  tasks  in  their  own  specialties,  using  an  adjectival  difficulty
scale.  These  data  are  useful  as  estimates  of  relative  task  difficulty
within  specialties,  but  cannot  be  compared  across  career  fields.

This  then  was  the  premise  for  the  development  of  a  common  scale  by
which  task  difficulty  both  within  and  across  specialty  areas  can  be
measured.  To  this  end,  AFHRL  has  sponsored  the  development  of
task-anchored  scales  for  electronic  and  mechanical  aptitude  requirements  and
is  in  the  process  of  developing  a  task-anchored  scale  for  general  and
administrative  aptitude  requirements.


SUMMARY  OF  TASK-ANCHORED  SCALE  DEVELOPMENT


The  objective  of  each  of  these  efforts  was  to  develop  a  25-level
scale  to  be  used  to  rate  tasks  in  terms  of  "learning  difficulty."
Learning  difficulty  is  defined  as  "the  time  required  to  learn  to
perform  a  task  satisfactorily."

Each  scale  was  developed  using  a  three-phase  effort.  The  first
phase  required  the  selection  of  600  tasks  across  fifteen  (15)  career
fields.  These  tasks  were  evaluated  by  a  field  team  of  eight  observers
who  assessed  each  task  at  work  sites.  Each  member  of  the  team  then
rank-ordered  the  600  tasks  in  terms  of  learning  difficulty.  AFHRL
used  the  ranking  data  to  generate  a  mean  rank  ordering  of  the  600  tasks.
This  list  was  divided  into  25  intervals,  with  two  tasks  selected  from
each  interval  to  form  a  task-anchored  rating  scale.  The  scale  is  now
representative  of  both  task  types  and  of  learning  difficulty  across
the  career  fields.
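
The first-phase construction can be sketched as follows. The averaging of ranks and the division into 25 equal intervals follow the description above; the rule for choosing the two anchor tasks within an interval is not stated in the paper, so taking the two tasks nearest the middle of each interval is an assumption, as are the function names.

```python
def mean_ranks(rankings):
    """rankings -- one dict per observer mapping task name to rank
    (1 = easiest to learn). Returns the mean rank of each task."""
    tasks = rankings[0].keys()
    return {t: sum(r[t] for r in rankings) / len(rankings) for t in tasks}


def build_scale(rankings, levels=25, per_level=2):
    """Return a list of `levels` anchor groups, easiest level first."""
    means = mean_ranks(rankings)
    ordered = sorted(means, key=means.get)  # mean rank ordering
    size = len(ordered) // levels           # 600 tasks -> 24 per interval
    scale = []
    for level in range(levels):
        interval = ordered[level * size:(level + 1) * size]
        # ASSUMPTION: anchors are the tasks nearest the middle of the interval.
        start = max(0, len(interval) // 2 - per_level // 2)
        scale.append(interval[start:start + per_level])
    return scale
```

With 600 tasks and 25 levels, each interval holds 24 tasks and contributes two anchors, giving a 50-task scale against which new tasks can be compared.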

The  second  and  third  phases  of  this  effort  involved  the  rating  of
60  tasks  from  each  of  a  selected  group  of  career  fields.  In  these
phases,  the  tasks  in  each  career  field  were  assessed  at  work  sites  by
two  field  teams  of  six  observers  each.  Each  team  member  then  rated
the  tasks  using  the  25-level  scale  by  comparing  the  task  to  be  rated
with  tasks  listed  on  the  task-anchored  scale.  These  data  are  ultimately
to  be  used  by  AFHRL  to  assess  aptitude  requirements  for  entry  into  the
sampled  career  fields.


PROCEDURAL  GUIDES 

It  became  apparent  during  the  assessment  and  rating  process  that
it  would  be  necessary  to  develop  a  procedural  guide  in  order  to
ensure  consistency  between  teams  in  the  criteria  used  to  assess
learning  difficulty,  and  in  the  methods  for  use  of  the  scale.

The  guide  was  developed  in  two  parts.  The  first  part  describes
the  assessment  and  rating  procedures,  and  is  based  on  guidance  from
AFHRL  and  on  Kinton's  collective  experience  in  managing  the  assessment
teams.  It  addresses  novice  panel  members  who  are  presumed  to  be
experienced  in  the  USAF  world  of  work,  but  who  have  no  psychometric
training.

The  second  part  provides  a  detailed  explanation  of  each  task,
focusing  on  the  skills  or  knowledge  which  must  be  learned  to  perform
it.  The  guide  was  developed  after  detailed  study  of  each  task,
principally  with  the  aid  of  subject  matter  experts  (SMEs)  at  USAF
technical  schools.  Detailed  descriptions  were  prepared  for  each  task
on  the  scale,  noting  conditions  of  performance,  criterion  standards,
specific  skill  or  knowledge  required,  and  any  circumstances  which  tended
to  mitigate  or  increase  the  difficulty  of  learning  the  task.

The  procedural  guide  was  validated  by  a  field  test  using  three
teams  to  rate  60  tasks  in  each  of  four  career  fields  (two  electronic
and  two  mechanical).  One  team  had  prior  experience  using  the
task-anchored  scales  and  rated  the  tasks  in  each  of  the  four  career  fields.
The  other  teams  consisted  of  persons  who  had  no  prior  experience  in
using  the  scale,  but  a  broad  general  knowledge  of  the  USAF  world  of
work  and  specific  competence  in  mechanical  or  electronic  career
fields  (supervisor  equivalence).  One  of  these  teams  rated  the  tasks
in  the  two  electronic  fields,  the  other  rated  the  tasks  in  the  two
mechanical  fields.

The  rating  data  from  the  novice  teams  show  an  interrater
reliability  coefficient  of  from  .940  to  .960,  and  a  coefficient  of
from  .910  to  .950  for  the  experienced  team  across  the  four  career
fields.  The  two  teams  as  a  group  had  reliability  coefficients  of  .97
to  .94  across  the  four  career  fields.  We  interpret  these  data  as
suggesting  that  the  task-anchored  scale  procedure  is  a  reliable
method,  and  that  the  Procedural  Guides  are  effective  means  to  replicate
the  method.
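
The paper does not say which interrater reliability coefficient was computed. As an illustration only, the sketch below assumes a reliability-of-the-mean-rating coefficient computed with the Cronbach's alpha formula, treating each rater as an "item" and each task as a "case". The function name rater_reliability is hypothetical.

```python
from statistics import variance


def rater_reliability(ratings):
    """Reliability of the summed (or mean) rating across raters.

    ratings -- list of per-rater lists; ratings[r][t] is rater r's
               25-level scale rating of task t.
    ASSUMPTION: Cronbach's alpha with raters as items; the coefficient
    actually used in the study is not identified in the paper.
    """
    k = len(ratings)
    totals = [sum(task_ratings) for task_ratings in zip(*ratings)]
    rater_var = sum(variance(r) for r in ratings)
    return k / (k - 1) * (1 - rater_var / variance(totals))
```

A coefficient of this kind approaches 1.0 when raters order the tasks the same way, even if one rater is consistently more severe than another, and falls as raters disagree about relative difficulty.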


USE  OF  FIELD  TEAMS 


One  or  more  field  teams  were  used  in  each  phase  of  a  task-anchored
scale  development.  Each  team  typically  consisted  of  from  six  to  eight
ex-military  personnel.  The  majority  of  the  team  members  were  retired
Air  Force  NCOs  with  experience  in  one  or  more  of  the  career  fields  being
sampled.  Additional  conditions  for  selection  included  a  requirement
to  have  worked  as  a  supervisor  of  journeymen  and  to  have  had  five  or
more  years'  experience  in  the  military.  Thus,  the  field  teams  represented
a  high  level  of  Air  Force/military  background,  plus  hands-on  and
supervisory  experience  in  one  or  more  of  the  career  fields  being  assessed.

Each  team  assessed  the  task  sampling  as  a  group  through  visits  to
appropriate  Air  Force  bases.  Time  constraints  precluded  the  physical
observation  of  each  task;  thus,  the  team  evolved  an  interview
technique  with  tours  of  the  work  facilities  for  task  assessment.  The
interviews  were  conducted  as  a  group  with  two  or  more  personnel  from
each  career  field,  with  each  member  of  the  team  encouraged  to  ask
questions.  The  interviews  were  designed  to  identify  those  factors
which  affect  the  learning  difficulty  of  each  task  and  to  determine  to
what  extent  these  factors  play  a  part  in  the  learning  of  that  task.
After  completion  of  the  interview  and  tour  of  the  facilities,  the  teams
would  typically  reconvene  and  discuss  the  information  gathered  at  the
interview.  These  discussions  were  used  to  ensure  that  the  boundary
conditions  of  each  task  are  defined,  that  all  of  the  factors  have  been
accounted  for,  and  that  any  biases  of  the  personnel  interviewed  have
been  taken  into  account.  Lengthy  and  often  heated  discussions  occurred
at  these  sessions  as  the  biases  of  the  individual  panelists  also  became
known.  At  no  time  were  the  actual  ratings  of  the  tasks  to  be  discussed
at  these  meetings.


The  ratings  were  then  performed  independently  by  each  team  member
the  same  day  as  the  interview  was  performed.  Data  to  date,  using  the
scales,  indicate  a  median  interrater  reliability  coefficient  of  .955
on  over  1500  tasks  across  35  electronic  career  fields  and  a  coefficient
of  .940  on  over  1500  tasks  across  38  mechanical  career  fields.


Conclusions  on  the  Use  of  Field  Teams

Our  experience  indicates  that  the  advantages  of  using  field  teams
to  collect  data  of  the  type  described  far  outweigh  the  disadvantages.
The  most  significant  disadvantages  are  the  costs  of  a  team  or  teams
for  travel  and  the  problem  of  consistently  maintaining  an  experienced
team  over  long  periods  of  time.

The  advantages  are  that  the  data  are  collected  at  the  source  by
actual  understanding  of  the  task  with  observation  of  the  equipment  and
worksite.  The  team  members  not  only  get  a  worksite  interpretation  of
the  task  through  the  interviews,  but  also  often  observe  or  participate
in  actual  task  performance.  For  example,  one  team  spent  a  day  in  a
missile  silo  observing  demonstrations  of  task  performance.  In  addition,
various  team  members  have  operated  devices  from  flight  simulators  to  an
electrical  lineman  cherry  picker.  Additional  advantages  accrue  when
two  teams  are  fielded  in  that  data  are  collected  from  two  different
sites  in  each  career  field.

The  success  in  using  field  teams  on  these  projects  is  a  high
recommendation  for  the  use  of  teams  in  future  efforts  to  collect
human  factors  data.


The  United  States  Navy's  Chief  of  Naval  Education  and  Training
(CNET)  intends,  insofar  as  possible,  to  deliver  a  trained  man  for
every  billet  at  minimum  cost.  A  major  obstacle  to  this  objective  is
that  required  training  decision  and  development  information  is  not
available  in  a  manageable  form  and  timely  manner.  Needed  is  not  only
training  objective  data,  but  human  management  data  that  will  support
decisions  and  development  processes  about  who,  when,  where,  what,  and
how  to  train  -  decisions  and  processes  implied  in  the  Interservice
Instructional  Systems  Development  (ISD)  Model.  For  this  purpose  the
Navy  has  embarked  on  the  design  of  a  sophisticated  information  system,
i.e.,  the  Naval  Enlisted  Professional  Development  Information  System
(NEPDIS).

This  paper  will  present  a  concept  upon  which  the  CNET  is  dependent
if  the  full  potential  of  the  Naval  Education  and  Training  Command's
(NAVEDTRACOM's)  ISD  effort  is  going  to  be  realized.  The  Navy,  in  this
author's  opinion,  cannot  expect  to  exploit  ISD  to  its  fullest  potential
unless  an  information  system,  such  as  NEPDIS,  is  developed.


PROBLEM

Like  the  other  services,  the  Navy  has  adopted  and  is  in  the
process  of  implementing  ISD  -  a  systematic  method  of  designing,
developing,  implementing  and  evaluating  the  total  learning  process  in  terms
of  specific  objectives  of  the  learner.  ISD  represents  a  major
advancement  in  training  technology  and  all  services  have  the  potential  of
realizing  great  benefits  by  its  implementation.  However,  ISD  as  it
presently  exists  in  the  services,  represents  a  "what  should  be  done"
policy.  It  does  not,  in  most  instances,  specify  how  each  phase  or  step
is  to  be  accomplished.

The  paramount  and  fundamental  problem  ISD  presents  the  Navy  is
one  of  data  management.  The  CNET  is  responsible  for  the  design,
development,  management,  and  evaluation  of  2,500  enlisted  courses  of
instruction.  Following  the  ISD  model,  these  2,500  courses  involve
approximately  7,000,000  tasks,  10,000,000  terminal  learning  objectives
(TLOs),  11,000,000  TLO  criterion  referenced  performance  measures
(CRPMs),  20,000,000  enabling  learning  objectives  (ELOs)  and  22,000,000
ELO  CRPMs.  Not  only  does  ISD  result  in  the  generation  of  this  data,
but  it  necessitates  access  to  this  data  on  a  recurring  basis.  For
example,  step  4  of  phase  I  of  the  ISD  model  dictates  that  whenever  a
training  requirement  is  identified,  not  only  all  Navy  but  all  DOD
training  will  be  investigated  to  determine  if  there  is  an  existing
course  or  portion  of  a  course  that  will  satisfy  the  training  requirement.
If  one  does  exist,  a  tremendous  training  development  savings  could  be
realized  and  redundant  training  avoided.  But  how  does  one  conduct
such  an  investigation?  One  approach  would  be  to  search  existing
training  by  learning  objectives  and/or  tasks.  Even  if  the  data  were
accessible,  could  an  adequate  search  be  conducted  manually  in  a
cost-effective  manner?  Probably  not.

Most  of  the  effort  that  has  gone  into  Naval  course  development  in
the  past  ten  years  cannot  be  re-identified  in  any  orderly  fashion.
That  is  to  say  that  it  is  not  known  specifically  which  tasks  are
trained  in  which  courses.  Additionally,  an  audit  trail  (implied  in  ISD)
is  lacking,  through  which  tasks  might  be  traced  from  stated  training
requirements,  through  design  and  development,  coverage  in  a  course,
and  graduate  performance  in  the  fleet  ...  then  back  to  analysis  for  any
reason.  It  is  highly  likely  that  training  to  specific  tasks  appears
many  times  in  many  curricula  ...  and  this  is  probably  necessary  in  some
instances,  but  probably  an  unnecessary  redundancy  in  others.  If  the
Navy  is  going  to  manage  ISD  and  to  employ  what  it  has  to  offer  in  a
highly  effective  manner,  it  must  have  an  audit  trail  through  each
existing  curriculum,  with  each  element  of  that  curriculum  linked  to
all  others,  so  that  the  data  can  be  located  for  comparison  and  update.
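
The kind of linkage argued for here can be illustrated with a small sketch. It is not a NEPDIS design; the class AuditTrail, its method names, and the identifiers used for tasks and courses are all hypothetical. The sketch keeps a two-way index between tasks and courses so that "which courses train this task?" and "which tasks are trained in more than one course?" can be answered without a manual search.

```python
from collections import defaultdict


class AuditTrail:
    """Two-way index linking tasks to the courses that train them."""

    def __init__(self):
        self._courses_by_task = defaultdict(set)
        self._tasks_by_course = defaultdict(set)

    def link(self, task, course):
        """Record that `course` provides training on `task`."""
        self._courses_by_task[task].add(course)
        self._tasks_by_course[course].add(task)

    def courses_training(self, task):
        """Existing courses that already cover a newly identified task."""
        return sorted(self._courses_by_task.get(task, ()))

    def redundant_tasks(self):
        """Tasks trained in more than one course (possible duplication)."""
        return {t: sorted(c) for t, c in self._courses_by_task.items()
                if len(c) > 1}

    def common_tasks(self, course_a, course_b):
        """Tasks shared by two courses (candidates for a core course)."""
        a = self._tasks_by_course.get(course_a, set())
        b = self._tasks_by_course.get(course_b, set())
        return sorted(a & b)
```

A full audit trail would add further links (to learning objectives, performance measures, and fleet performance data), but the same indexing idea applies to each.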

The  responsibilities  for  CNET  to  develop  and  maintain  courses
continue  to  mount.  Navy  studies  have  recommended  cutting  back  on
the  numbers  of  NECs.  This  would  have  made  training  more  feasible  and
more  affordable  by  permitting  core  (or  common)  courses,  feeding  specific
or  finger  courses  in  support  of  NECs  which  have  common  skill
requirements.  But  instead  of  decreasing  NECs,  the  Navy  has  gone  from  677  in
1968  to  1179  in  January  1976.  This  number  of  NECs  may  be  necessary
and  based  upon  valid  requirements,  but  tremendous  savings  might  have
been  realized  in  course  development  costs  had  it  been  possible  to
identify  the  common  skill  requirements  among  related  NECs  and  to
combine  them  into  a  few  core  courses.  There  is  little  need  here  to  address
diminishing  resources;  this  is  a  common  condition  today  in  civilian
organizations,  as  well  as  in  DOD.  What  must  be  done,  then,  is  to
manage  ever-diminishing  resources  better,  so  that  it  is  possible  to
respond  to  ever-increasing  responsibilities  with  logic,  based  upon
quantified  data.


RESOLUTION 

I  believe  that  it  is  essential  that  an  automated  system  such  as
NEPDIS  be  developed,  to  provide  NAVEDTRACOM  with  the  resources  to
exploit  the  opportunities  of  the  ISD  model  fully  and  to  analyze  the
multitude  of  required  data  critical  to  training  development,  in
order  to:

o  eliminate  unnecessary  training;

o  achieve  reductions  in  both  training  development  and  actual
training  time;

o  coordinate  the  development  of  training  to  eliminate  duplication
of  training  development,  as  well  as  actual  training  activities;

o  increase  the  efficiency  of  training  development  activities;

o  develop  the  most  cost-effective  career  training  ladders  for
enlisted  personnel;

o  efficiently  evaluate  work  efforts  and  sequential  output
material  in  order  to  discover  and  correct  deficiencies  in
a  cost-effective  manner;

o  ascertain  the  actual  cost  associated  with  the  development  and
conduct  of  training.


NEPDIS  SUBSYSTEMS

To  satisfy  the  objectives  for  which  NEPDIS  was  conceived,  five
subsystems  or  processes  are  required,  as  illustrated  in  Table  1.


TABLE  1

NEPDIS  SUBSYSTEMS


The  Training  Development  Subsystem  will  provide  a  means  to  store,
manage,  and  quantify  job  data  in  a  manner  that  will  enable  the
identification  of  commonality,  skill  levels,  and  complexities.  This
subsystem  will  also  maintain  a  record  of  all  training  development
activities  including  their  current  status,  decisions,  and  the
individual  or  agency  responsible.

The  Instruction  Subsystem  will  provide  the  means  to  record  all
Naval  enlisted  instructional  programs  including  non-resident  and  OBT
as  well  as  resident  programs,  developed  and  presently  being  developed.
All  training  material  and  literature,  as  well  as  the  criterion
referenced  performance  measurements  associated  with  each  instructional
program,  will  be  included  in  these  records.

The  Training  Record  and  Evaluation  Subsystem  will  support  internal
and  external  training  evaluation  by  providing  a  flexible  and  readily
accessible  means  for  storing  and  analyzing  evaluative  data,  as  well
as  supporting  the  selection  of  study  samples  and  providing  a  means  to
access  sample  populations.  It  will  maintain  a  record  of  all  Naval
training  (resident,  non-resident,  and  OBT)  acquired  by  Navy  enlisted
personnel,  as  well  as  a  record  of  skills  and  knowledge  obtained  by
traditional  and  non-traditional  civilian  education,  vocational  schools,
and  so  forth.  Criterion  referenced  performance  testing  results  for
each  training  or  education  course  completed  will  be  maintained  and
readily  accessible  for  training  evaluation.

The  Career  Development  Subsystem  will  provide  a  means  for
identifying  Naval  enlisted  career  ladders.  It  will  reflect  the  position  of
an  individual  within  a  given  career  ladder,  and  the  career  options
open  to  each  sailor.  In  the  process,  this  subsystem  will  identify,
for  the  Navy,  the  most  cost-effective  career  paths  for  enlisted
personnel.

The  Audit  Subsystem  will  provide  the  means  to  manage  the  entire
training  development  system  by  ascertaining  the  full  impact  of  changes
external  to  the  training  community  (hardware  modifications,  operating
practice,  doctrine,  etc.),  as  well  as  the  internal  impact  of  training
decisions.  This  subsystem  will  provide  an  automatic  alert  to
proponents  of  training  activities,  when  those  activities  are  impacted
either  by  changes  in  the  manpower  requirements  or  in  systems  hardware.


NEPDIS  DATA  BASE 


The  NEPDIS  data  base  described  represents  a  conceptual  design
only.  No  doubt  several  revisions  of  this  data  base  will  occur  during
the  development  of  NEPDIS.  The  paradigm  below  illustrates  the  NEPDIS
data  base  as  it  is  presently  conceived.  It  should  also  be  understood
that,  although  on-line  storage  symbols  have  been  used  to  denote  or
illustrate  these  files,  it  is  not  the  intention  at  this  time  to
specify  the  choice  between  on-line  or  off-line  (batch)  storage  or
processing.  These  symbols  are  used  merely  to  identify  files  -  not
storage  or  processing  modes.


TABLE  2

NEPDIS  SUBSYSTEMS  AND  DATA  BASE


The  Task  Inventory  File  will  maintain  a  record  of  all  enlisted
task  inventories  with  supportive  data  that  will  permit  the
identification  of  task  commonality,  complexity,  and  skills.  "Skills"  in  this
context  would  include  enabling  mental  and  physical  skills  as  well  as
terminal  physical  skills.

The  Training  Development  Management  File  will  maintain  a  record 
of  all  training  development  activities  including  data  such  as: 

(1) the identification of a task

(2) the decision whether or not to develop training (for a task)

(3) the method of training (resident, non-resident, or OBT)

(4) training design and development

(5) instructional programs

(6) training literature and materials

(7) performance measures

(8) projected and actual training development milestones

(9) the organization and individual responsible for training
development activity

(10) the current status of a given training development activity

This  file  can  be  viewed  as  the  catalyst  for  an  automated  training 
development  PERT. 
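
As a rough illustration of what one record in such a file might hold, the ten items listed above can be sketched as a data structure. This is only a sketch under stated assumptions: the class names, field names, field types, and the overdue-milestone helper are all hypothetical and are not part of the NEPDIS design described here.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Milestone:
    """One projected/actual milestone pair (item 8). Hypothetical layout."""
    name: str
    projected: date
    actual: Optional[date] = None  # None until the milestone is reached


@dataclass
class TrainingDevelopmentRecord:
    """One training development activity; field names are illustrative only."""
    task_id: str                  # (1) identification of a task
    develop_training: bool        # (2) decision whether to develop training
    training_method: str          # (3) "resident", "non-resident", or "OBT"
    design_notes: str = ""        # (4) training design and development
    instructional_programs: List[str] = field(default_factory=list)    # (5)
    literature_and_materials: List[str] = field(default_factory=list)  # (6)
    performance_measures: List[str] = field(default_factory=list)      # (7)
    milestones: List[Milestone] = field(default_factory=list)          # (8)
    responsible_party: str = ""   # (9) organization and individual
    status: str = "not started"   # (10) current status

    def overdue_milestones(self, today: date) -> List[Milestone]:
        """Milestones whose projected date has passed with no actual date."""
        return [m for m in self.milestones
                if m.actual is None and m.projected < today]
```

A PERT-style report could then be built by asking each record for its overdue milestones on a given date.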

The Instructional Program File will maintain a record of all
Naval enlisted resident, non-resident, and OBT training programs
developed or currently being developed. For each program, it will
record the associated learning objectives, performance measures, and
a course synopsis. This file will provide for an audit trail that
can link instructional programs to tasks, training literature and
materials, and career ladders. This file will provide support not
only to the training development community, but to the schools as
well. It can be viewed as an ERIC for the Navy enlisted schools.

The Training Material and Literature File will include a record
of all training literature (instructor guides, programmed instruction
texts, rate training manuals, etc.) and training materials (films,
video tapes, audio cassettes, viewgraphs, slides, etc.), whether
developed or presently under development, for Naval enlisted training
programs. This file too would support an audit trail linking
training literature and material to tasks, learning objectives,
performance measures, and so forth. It will support not only the
training development community but the instructors as well: an ERIC
for school staff.

The Training and Education File will provide a central, comprehensive
education and training record for all Naval enlisted
personnel. This file will include biographic data, current assignment,
high school, college, and vocational education courses completed,
as well as degrees or diplomas. It will record USAFI and
DANTES courses, and formal Naval training (resident, non-resident,
and OBT) to include the results of criterion-referenced performance
measures. Equivalency examinations and scores will also be included.
A segment of each individual's training record will be reserved for
training evaluation data. This portion of the records will be used
by the CNET, N5 for depositing training evaluation data for analytic
purposes. Once an analysis is completed, this data would be deleted
and the file segment could be used once again for the same purpose.
This will enable CNET, N8 to have ready access to the entire Naval
enlisted population for purposes of training evaluation.

The Training Evaluation File will be a flexible, multipurpose
working file, designed so that most training evaluation data can be
stored in it and evaluative analyses performed against it. This file
should not be confused with the Training and Education File. The
Training and Education File will enable training evaluators to access
and store information about specific individuals. The Training
Evaluation File will be used as a depository of information extracted
from the Training and Education File. Evaluative analyses can then
be performed on data from the Training Evaluation File in a much more
efficient and cost-effective manner than would be possible using the
Training and Education File, given the volume of that file.

The Career Ladder File will record all Naval enlisted career
ladders with a capability to identify the grades associated with each
career ladder step, the core and finger training required to achieve
each of those grades, and information concerning where training may
be acquired and when during a given career continuum that training
is appropriate.


SUMMARY 

The NEPDIS is achievable. It will pay for itself. Furthermore,
in the long term it will be unavoidably necessary. Very soon the
Navy will not be able to keep up with technology, or to man a competent
Naval establishment, by using existing ad hoc methods to decide
what will be taught, to whom, and when. Tomorrow's Navy will no more
be able to identify its training needs through use of human judgment
and hand-processed paperwork than it could manage its logistic system
by those same methods. The problems are similar. Every billet in
every ship generates a continuing and unique requirement for electronic
parts, paint, pipe and paper clips. When those do not arrive on time,
the systems go down; if too much is ordered or stockpiled, the costs
skyrocket. Those same billets each require unique inventories of
skills, knowledge, and experience, yet training developers and
designers have never really been able to meet those requirements,
except by guesswork. Large automated systems have been developed
to support the logistics community, but so far no such resources
have been applied to the equally complex, important, and costly
activities of training.


VALIDATION OF CLASS 'A' SCHOOL
TRAINING AGAINST JOB CONTENT


Richard S. Lanterman
U.S. Coast Guard Headquarters
G-P-1/2/62
Washington, DC 20590


David A. Bownas
Personnel Decisions Research Institute
821 Marquette Avenue
Foshay Tower
Minneapolis, MN 55402


Content validation is a process for showing that some
procedure (a test, work sample, training course, etc.) is a
representative sample of a relevant domain of activities. In
this study, our object is to develop a procedure for determining
if the material presented in a Coast Guard A school covers the
information required to perform a representative sample of the
domain of tasks defining the corresponding job (rating).

The first step in the process is to identify the target
domain of tasks. This was relatively simple, since the Coast
Guard has recently completed task analysis surveys for the three
ratings we are considering. We will use the task lists emerging
from these surveys to represent the task domains for the three
ratings, although we will make some checks to ensure that the
lists are up-to-date. The information available for each task
includes the proportion of personnel performing it, the relative
time spent per task, and its average difficulty rating. In
addition, we will survey a small sample of personnel from each
rating to obtain estimates of each task's criticality (the threat
to the overall mission if the task is not correctly performed),
and whether each task is performed at a helper or doer level.

Our current plan is to combine the criticality and time
spent data to obtain an overall importance index for each task.
This will probably be done by either adding or multiplying the
time spent and criticality ratings, producing an importance index
where either high criticality or high time spent can result in a
moderate to high overall importance value.
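
The two candidate combination rules can be compared on a few made-up tasks. The sketch below is illustrative only: the 1-to-7 rating scale, the task names, and the function name are assumptions introduced here, and no choice between adding and multiplying is implied.

```python
def importance_index(time_spent, criticality, method="add"):
    """Combine two ratings for one task into a single importance value."""
    if method == "add":
        return time_spent + criticality
    if method == "multiply":
        return time_spent * criticality
    raise ValueError("method must be 'add' or 'multiply'")


# Hypothetical tasks: (relative time spent, criticality), each on a 1-7 scale.
tasks = {
    "stand radio watch": (6, 2),    # much time, low criticality
    "rig towline": (2, 7),          # little time, high criticality
    "file routine report": (2, 2),  # low on both
}

for name, (time_spent, criticality) in tasks.items():
    added = importance_index(time_spent, criticality, "add")
    multiplied = importance_index(time_spent, criticality, "multiply")
    print(f"{name}: add={added}, multiply={multiplied}")
```

Under either rule, a task that is high on only one of the two ratings still ends up well above a task that is low on both, which is the property the index is meant to have.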

In the next research step, we will identify the class A
school training content. We hope to identify approximately fifty
(50) curriculum elements or training topics that are relatively
discrete and internally homogeneous for each rating. The school
for one rating seems to be broken down rather nicely already,
with fifty-three (53) substantive topics covered. The other two
are less than ideal, with approximately twenty-five (25) broad
weekly topics, and approximately one hundred (100) specific
topics at the next level. We are currently meeting with
instructors to identify fewer elements at an intermediate level
of specificity. In identifying these curriculum elements, we
will consider the Coast Guard Enlisted Qualifications Manual,
course curriculum outlines, and suggestions from class A school
training personnel.

After school content has been specified, the validation of
course content will be accomplished by mapping the curriculum
against the tasks to determine whether the most important tasks
are receiving the most emphasis in training. This mapping will
be accomplished through a series of judgments.

Class A school instructors will make three evaluations.
First, they will estimate the extent to which performance on each
task is emphasized during formal training. Next, they will
identify the three or four curriculum elements in which each task
is predominantly trained. Finally, they will indicate how much
emphasis each curriculum element receives during training,
primarily as a function of the amount of time spent on each topic
or element.

A sample of recent class A school graduates will make an
additional set of evaluations to serve as a reliability check on
the instructors' data. Since instructors may have a tendency to
overestimate what is taught in their courses, we will ask recent
graduates to indicate how well they can currently perform each
task. We will compare their mean ratings with instructors'
ratings of the emphasis given each task to determine whether some
tasks may actually be receiving less attention in training than
the instructors believe. Where such discrepancies occur, we will
request some substantiation of the instructor ratings.

Class A school records will provide a second source of
information about curriculum element emphasis. Results of weekly
examinations and final course results are recorded on Student
Record Forms for each student. By analyzing these, we can
determine empirically how a curriculum element, or a group of
curriculum elements for a week, contributes to a final course
grade, and hence to whether a student is promoted into a
specialty. We will compare these results with instructors'
ratings of curriculum element emphasis to determine whether
elements are weighted in the final course grade in the same way
that they are emphasized in class. Both types of element weight
estimates will be considered in evaluating course content
validity.


Training content validity will be assessed from these data
in three steps. First, each task will be "assigned" to the
curriculum elements where instructors feel it is primarily
taught. A task may be apportioned across several curriculum
elements according to the extent to which it is taught in each
element. The result will be a matrix of curriculum elements by
tasks, indicating the extent to which performance on each task is
taught in each curriculum element.

In the second step, the matrix entries under each curriculum
element will be weighted by the curriculum element's emphasis
(either the emphasis it receives in training, or its weight in
determining final course scores).

Finally, these weighted matrix values will be summed across
all fifty (50) curriculum elements for each task, to obtain one
index of how much emphasis each task receives in the training,
and one estimate of how much each task is weighted in the final
course grade.
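
The three steps above amount to a weighted sum over the task-by-element matrix. The following is a minimal sketch on a toy case of three tasks and four curriculum elements; the numbers, the variable names, and the choice to express apportionment as fractions summing to one per task are assumptions made for illustration, not data from the study.

```python
def task_profile(allocation, element_weights):
    """Weighted sum across curriculum elements for each task.

    allocation[t][e] is the share of task t taught in element e (step 1).
    element_weights[e] is the emphasis or grade weight of element e (step 2).
    The return value holds one summed index per task (step 3).
    """
    return [
        sum(share * weight for share, weight in zip(row, element_weights))
        for row in allocation
    ]


# Toy apportionment: rows are tasks, columns are curriculum elements.
allocation = [
    [1.0, 0.0, 0.0, 0.0],    # task 0 taught entirely in element 0
    [0.5, 0.5, 0.0, 0.0],    # task 1 split between elements 0 and 1
    [0.0, 0.0, 0.25, 0.75],  # task 2 split between elements 2 and 3
]

class_emphasis = [10, 20, 30, 40]     # e.g. hours spent on each element
grade_weight = [0.4, 0.3, 0.2, 0.1]   # share of final grade per element

emphasis_profile = task_profile(allocation, class_emphasis)
grade_profile = task_profile(allocation, grade_weight)
```

Running the same function with the two sets of element weights yields the two task profiles, one for emphasis in training and one for weight in the final course grade.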

These two task training profiles (across the five hundred
(500) or so tasks in each rating) will be correlated with task
importance data from the task analysis survey, to ascertain
whether the task training emphasis profiles match the task
importance profile. For ratings where this correlation is high,
we will conclude that the tasks emphasized in course content are
the same as those most important for successful job performance.
When the correlations are low, we will compute variances between
training emphasis values and importance values to identify tasks
that appear to be over- or under-emphasized in training. Lists of
such tasks will be provided to curriculum developers, so they can
decide whether the course content should be changed to reflect
task importance more closely, or whether there is some additional
factor which warrants the unusual emphasis given these particular
tasks.
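
One way this comparison might be carried out is sketched below. It computes a Pearson correlation between an emphasis profile and an importance profile, then flags tasks whose standardized emphasis differs from their standardized importance by more than a threshold. The standardization, the threshold of 1.0, and the toy data are assumptions introduced for the example; the text above does not specify how the discrepancies will be computed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def flag_discrepancies(emphasis, importance, threshold=1.0):
    """Return (over_emphasized, under_emphasized) task indices."""
    diffs = [e - i for e, i in zip(zscores(emphasis), zscores(importance))]
    over = [k for k, d in enumerate(diffs) if d > threshold]
    under = [k for k, d in enumerate(diffs) if d < -threshold]
    return over, under


# Toy profiles for five tasks (made-up values).
emphasis = [1, 2, 3, 4, 5]
importance = [5, 2, 3, 4, 1]

r = pearson(emphasis, importance)
over, under = flag_discrepancies(emphasis, importance)
print(f"r = {r:.2f}, over-emphasized: {over}, under-emphasized: {under}")
```

Standardizing both profiles first puts emphasis and importance on a common scale, so a flagged task is one whose relative standing differs between the two profiles.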

Although the need for ensuring the validity of training
content has been stressed for some time, and although informal
procedures exist for developing course curricula from task lists
or Qualification Manuals, we believe this is one of the first
procedures for quantitatively evaluating the content validity of
an existing training curriculum; and we believe the methodology
will provide a valuable contribution to the area of personnel
training and evaluation.


FINDING A SYSTEMATIC METHOD FOR DERIVING OBJECTIVES
FOR FOREIGN LANGUAGE TRAINING

Francis  A.  Cartier 

Defense  Language  Institute  Foreign  Language  Center 

The  views  of  the  author  of  this  paper  do  not  purport 
to  reflect  the  position  of  the  Army  or  the  Department 
of  Defense. 

The first principle of instructional systems design requires
analysis of the duties and tasks that the graduate of the training
will actually have to perform. From the information
gathered in that analysis, the objectives and criteria for the
training are determined, and all other system development
processes, including the tests, are targeted specifically on
those objectives and criteria.

In  recent  years,  the  Services  have  performed  job  and  task 
analyses  for  a  great  number  of  military  jobs  and  acceptable 
procedures  have  been  worked  out  for  tabulating  the  analytical 
information  and  developing  objectives  and  criteria  for  the 
necessary  training.  As  a  result,  the  training  has  been  made 
more job-relevant and a great deal of training time,
previously  devoted  to  irrelevant  knowledge  and  skills,  has 
been  eliminated. 

The Defense Language Institute has been exploring the
applications of instructional systems theory to foreign-language
training since about 1971 and has been thoroughly
committed to it for about two years.
However, from the very beginning of this effort, certain
fundamental problems arose which are yet to be solved.

This  paper  will  not  present  solutions  to  these  problems, 
but  will  describe  some  of  them  and  discuss  our  effort  to  find 
solutions  to  them. 

Our first difficulty in converting to instructional systems
design is the great size and variety of the Defense Foreign
Language Program. We teach about 120 different courses at the
Foreign Language Center, in about 30 different languages.
I say "about" in both cases because the requirements change
from year to year. Many of these courses are nearly a year
in duration.

The second difficulty arises from the fact that the students
in some of our longest courses, the Basic Courses, are not
all being trained for the same type of job. DLI Basic Courses
do not train personnel for any particular job, but train them
to perform part or all of their jobs in a foreign language. As
a Defense school, we have students from the Army, Navy, Air
Force and Marine Corps, all of which have slightly different
ways of identifying the jobs and stating the requirements.

We  have,  therefore,  an  enormous  number  and  variety  of  jobs 
to  analyze.  Practical  considerations  make  it  necessary  for 
us to trade off some precision in targeting the instruction
to  the  individual  job  in  favor  of  generalizing  the  objectives 
of  training  for  optimum  applicability  to  several  student 
populations.

Fortunately, we do not bear the responsibility for job
and task analysis for each and every one of these students.
The great majority of our trainees come to us from the security
services. After several years of cooperation with the National
Security Agency and the cryptological training systems of the
Army, Navy and Air Force, DLI is now receiving fairly precise
information about the jobs under their control. There are still
some unanswered questions in this job field, but these questions
are gradually being worked out. The result is that we are
getting acceptable terminal skill objectives for the language
training of security service personnel and also a considerable
amount of information about the enabling objectives. We will
not, therefore, discuss this area of language training further
today.

But for the students who go to other commands and agencies,
we have no systematic method for gathering similar information.
This was a major finding of the Army Linguist Personnel Study
(commonly referred to as the ALPS) conducted by DCSPER in 1975
and 1976. The ALPS study helped define the problems somewhat,
although narrowly. It dealt only with the Army, of course.
And it dealt predominantly with administrative problems rather
than with technical problems, as this paper has also done so far.

Now,  however,  let  us  turn  to  the  technical  problems,  since 
these  are  the  ones  that  will  be  of  greatest  interest  to  testing 
specialists in an audience such as this.

The terminal skill objectives for a foreign-language course
are  typically  stated  in  terms  of  the  various  communication 
behaviors  the  graduate  is  required  to  perform.  Such  objectives 
are  centered  around  verbs  such  as  reads,  translates,  speaks, 
converses,  listens,  and  so  on. 

The  enabling  objectives  for  an  instructional  system  leading 
to  these  kinds  of  skills  are,  on  the  other  hand,  mostly  stated 
in  terms  of  the  specific  vocabulary,  tenses,  case  endings, 
idioms, technical jargon, and so on, that the student must
master  in  order  to  perform  the  communication  skills. 

We  are  necessarily  dealing,  then,  with  two  different  domains: 
the  domain  of  communication  skills  and  the  domain  of  linguistics. 
Defining  and  circumscribing  each  of  these  domains  separately 
presents  problems  which  are  beyond  the  present  state  of  the  art. 
But  what  is  still  more  difficult  is  reconciling  or  combining 
the  two  domains  in  a  way  that  is  meaningful  for  the  writer  of 
terminal and enabling objectives and for the designer of the
criterion-referenced  test  that  will  be  used  to  determine  whether 
the  student  is  sufficiently  trained  for  adequate  performance  on 
the  job. 

If  you  are  teaching  Morse  code,  to  take  a  simple  example, 
the  entire  domain  is  comparatively  easy  to  describe.  It  is 
quite  finite.  Each  enabling  objective  can  be  readily  listed 
and its achievement can be measured with some precision.
Furthermore, the performance test can include virtually every
element  and  significant  condition,  so  the  instructional  system 
can  be  designed  quite  specifically  to  prepare  the  trainee  to  pass 
the  test.  One  cannot  say  the  same  thing  about  a  foreign 
language. 

Because the desired product of language training is also
performance, applied performance testing is highly desirable.
This point was made at the 1975 MTA conference in Indianapolis
by Dr. Thomas Sachse of the Northwest Regional Education
Laboratory. He went on to say, however, that a search for such
measures  by  the  Clearinghouse  for  Applied  Performance  Testing 
produced  nothing.  That  is  not  strictly  true  today,  but  shows 
how  new  the  idea  of  performance  testing  is  in  the  field  of  foreign 
language  training.  If  it  were  easily  done,  there  probably  would 
have  been  at  least  something  to  be  found  in  1975. 

The  vocabulary  of  any  language  is  so  large  that  not  even  a 
native  speaker  of  the  language  knows  it  all.  Obviously,  we  do 
not  expect  the  trainee  to  learn  more  words  than  a  native  speaker 
knows.  The  question  is,  "Which  words  shall  be  omitted?"  It 
might  seem  logical  to  omit  those  that  are  not  known  to  the 
average  native  speaker.  Unfortunately,  we  have  been  unable  to 
locate  that  critically  important  individual.  Every  native 
speaker  seems  to  know  a  somewhat  different  sample  of  the  total 
vocabulary of the language, depending on his or her education,
profession,  experience,  etc.  All  right,  then,  let's  take  the 
vocabulary that most educated native speakers have in common.
That should be sufficient for our trainee: after all, it seems
to be sufficient for the native speakers. Unfortunately, that
common  vocabulary  is  also  far  too  large  for  any  practical 
instructional  system. 

So  let's  try  a  different  approach.  We  know  that  many  people 
can get along fairly well with a limited vocabulary by choosing
their words carefully. If you have mastered about 2,000 words
out of the tens of thousands in a language, you can probably
express yourself adequately in most situations. These words
would be, for the most part, the words that occur with highest
frequency in the language. One problem with this approach is
that, while the trainee can limit his own speech to the 2,000
words he knows, he has no control over the native speaker he is
conversing  with.  The  native  speaker  may  very  well  use  his  entire 
vocabulary  and  the  trainee  will  be  required  to  understand  it  if 
he  is  to  carry  on  an  intelligible  conversation. 

Now,  we  know  from  experience  that  such  trainees  can  learn 
to  make  fairly  good  guesses  about  the  words  they  do  not  know, 
but  it  is  very  difficult  to  establish  firm  objectives  regarding 
the  number  and  the  required  degree  of  success  at  such  guessing. 

So, we know in advance that our objectives will contain
some elements of ambiguity, some factors that will not be well
delineated and that will cause some problems for the writers
of criterion-referenced tests.


Can job and task analysis help us to define the objectives
with greater precision? One would hope so, while keeping in mind
that it will be quite impossible in any practical period of
study to gather enough data for full confidence that all possible
conversational content and situations have been collected.

It  should  be  noticed,  by  the  way,  that  by  concentrating  on 
vocabulary  alone,  we  have  said  nothing  so  far  about  grammatical 
structures,  idioms,  dialectal  variations  in  pronunciation,  and 
many, many other aspects of a foreign language, all of which
should  be  adequately  described  in  the  enabling  objectives  for  an 
instructional  system.  Of  all  the  separate  problems  that  these 
raise  for  analysis,  for  the  writing  of  objectives,  and  design 
of  tests,  let  me  mention  just  one. 

It  is  important  that  we  identify  the  types  of  persons  that 
the  student  will  have  conversations  with,  and  the  circumstances 
of  the  conversation,  because  languages  change  significantly 
from  place  to  place  and  person  to  person.  We  are  all  aware  of 
the  fact  that  we  do  not  speak  exactly  the  same  way  in  giving  a 
briefing  to  the  commander  as  we  do  at  a  poker  game  with  friends, 
or  with  our  wives.  Linguists  therefore  talk  about  the  different 
"registers"  of  the  language,  and  they  have  a  great  deal  to  say 
about  how  the  vocabulary  changes  in  different  situations.  What 
is not so obvious is that the grammar changes, too, and sometimes
quite drastically.
Some of these changes might be called grammatical errors by a
language teacher, but linguists sometimes argue whether these
are really sub-standard usages or represent quite correct grammatical
forms for the specific situations. The English language
is said to have five distinctly different registers, and it is
probable that all other languages have as many. So it is a
legitimate question to ask which of these registers, or degrees
of formality, of the language we should teach our students to
understand and to use.

Now, some of you will have noticed that, while I have not
specifically addressed my remarks to the question of criteria,
many of these comments about determining the nature of the
objectives themselves have also necessarily touched on how well
we expect the student to perform those skills. For many jobs,
such as Morse code, we can state the criteria with mathematical
precision, that is, with a percentage of permissible error at
a stated speed, and so on. It is not entirely meaningful,
however, to say that a person can carry on a conversation with
only 10% error, especially since it is probably not too far
from the truth to say that only about 10% of the usual conversation
is of any significance. It becomes fairly important,
then, to be able to identify where the errors occur. Unhappily,
we have no sure way of determining that.

Well  then,  what  can  we  do  in  the  way  of  job  and  task  analysis 
in the foreseeable future?

First, we must recognize that there probably will be no way to solve
all these theoretical and technical problems right away. We must,
instead, try to find some practical way of reducing the area of doubt
about the language skill requirements of the job and reducing the amount
of error in our determination of the vocabulary, grammatical features,
etc., that we include in a foreign-language instructional system. In
short, we need some research.

We have therefore contracted with Development and Evaluation
Associates of Syracuse, New York, for a research study into some of the
problems. This contract was awarded only a few weeks ago, so no
answers are yet forthcoming. In fact, the details of the procedure are
not yet fully worked out since the details of the procedure are exactly
what we want to get out of the project. However, I can sketch the
effort in fairly broad strokes.

First of all, we must select a manageable sample of the kinds of
jobs we are concerned with. Even though we are excluding the jobs in
the security services, there are hundreds of such jobs. Also, we must
limit the study to a very few of the nearly thirty languages we teach.
So far, we have decided to include Russian and Chinese Mandarin, but
have made no other choice yet.

Once we select the sample of jobs and of languages, we know we do not
need to do an exhaustive job of task analysis on the entire inventory
because all the jobs we are interested in also contain duties that do not
require use of a foreign language. Those non-language tasks are not a
part of the task analysis challenge for DLI.
Even an interrogator has a number of duties that do not involve
the foreign language. We look at the Soldier's Manual for
an Army 96-Charlie and pick out those tasks that must be
represented in our terminal skill objectives.

The same thing is true for a naval officer in NATO Headquarters,
for an Air Force Staff Advisor in Iran, or a military
attache  in  Argentina,  except  that  no  handy  guide  such  as  a 
Soldier's  Manual  exists. 

In each case, we have to determine quite specifically what
communication skills the job incumbent must have. Does he
have to engage in conversations or not? Does he have to give
briefings in the foreign language, or merely attend such briefings
and be able to summarize them? Does he read correspondence,
or write letters himself? Must he sometimes interview people?
If so, what kind of people? Is he required to make translations?
If so, what must he translate: letters, technical manuals or
directives, political news, or what? Our Job Analysis and
Standards Division is now doing a fairly good job of identifying
these communication tasks.

But once these skill types are identified, along with some
standards of adequacy, we need to find out, in some way, what
general and special type of vocabulary is needed, what idioms
and technical jargon he will encounter, and what grammatical
features of the language are either very common or, even if rare,
are critical to performance of the job.
It will be one of the contractor's most important tasks to
figure out how to determine which of these linguistic features
are necessary, and just how many of them will be sufficient, to
prepare the student for the job. There are some serious theoretical
problems as well as practical problems to be solved here.

But  the  contract  calls  for  more  than  that.  When  he  has 
determined  what  he  believes  is  a  practical  system,  the  contract 
requires  that  he  prove  that  it  works.  He  must  then  try  to  apply 
his  method  to  the  particular  jobs  in  the  particular  languages 
that  have  been  selected  earlier,  work  out  any  difficulties  that 
arise  in  the  application,  and  provide  us  with  a  workable  method 
that  we  can  apply  across  the  rest  of  the  jobs  and  languages 
for  which  we  must  devise  instructional  systems. 

The  entire  contract  effort  is  expected  to  take  21  months. 

I'd  like  to  conclude  with  a  few  remarks  about  what  this  may 
mean  to  those  of  you  who  are  not  engaged  in  foreign-language 
instruction.  The  Defense  Language  Institute  Foreign  Language 
Center  is  not  the  only  military  organisation  that  has  problems 
of  domain  definition,  and  difficulties  in  applying  the 
principles  of  job  and  task  analysis  to  determination  of  both 
terminal  and  enabling  objectives.  Many  of  the  rest  of  you  face 
serious  problems  in  developing  criterion-referenced  tests  for 
the special skills you are concerned with.

For example, the theory of criterion-referenced testing
requires  that  all  terminal  objectives  be  represented  in  the 
final  CRT  for  an  instructional  system,  but  some  of  you  share 
our  problem  in  that  the  number  of  objectives  is  far  too  large 
for  inclusion  in  a  CRT  of  practical  length.  In  such  cases, 
the  final  CRT  can  only  sample  the  required  behaviors. 

It  seems  likely  that  some  of  you  also  have  a  problem  parallel 
to  ours  in  that  the  skill  you  teach  is  not  the  actual  content 
of the MOS or the AFSC or whatever, but a skill through which
the job is performed. This is a particularly sticky problem for
job analysis and for test validation. Suppose we find, for example,
a logistical advisor in Iran who is not performing adequately.
How do we determine whether his failure is the result of inadequate
proficiency in the language or of not knowing his
logistical channels well enough?

In  the  Department  of  Defense,  a  foreign  language  is  never 
learned  for  its  own  sake,  but  in  order  to  do  something  else. 
Therefore,  unlike  other  military  skills,  such  as  performing 
maintenance  on  an  engine,  on  which  the  individual  concentrates 
wholly,  the  skill  of  language  is  one  that  must  be  performed 
while  working  toward  another  purpose,  such  as  the  obtaining  of 
information  or  persuading  someone  else  to  do  something,  or  whatever 
it  may  be.  The  language  skills  must  be  so  thoroughly  established 
that  they  are  habitual  and  require  virtually  no  conscious  attention 
and  are,  furthermore,  wholly  different  in  nature  from  the  job 
to be performed.

If nothing else, then, the research project points up the
difficulty of applying routine methods of job and task analysis
to skills of this type. At best, it may provide some methodological
clues that may be of use to schools with similar problems.
If so, I assure you that we will report these to you two years
from now.


STUDY OF TASK DIFFICULTY USING FIELD TEAMS

Mr. Fred L. Hart


ABSTRACT 


The history of the scientific study of work is summarized from
early work by Charles Babbage, better known for his early attempts
to build a digital computer. Historical work by F.W. Taylor, the
Gilbreths, L.H.C. Tippett, H. Munsterberg, R. Gagne, and R. Mager
is compared in relation to the task inventory method. The state of
the art is represented by the work of Ray Christal at AFHRL and his
counterparts in other services. Current capabilities of CODAP, task
inventory methods, and the 4-factor model are summarized as a basis
for other panelist presentations.

This panel is concerned with Job Task Analysis and with the
things currently being done in the several US Services to improve our
ability to analyze and describe jobs. Today we will be primarily
interested in describing jobs for the purpose of deciding the content
of training. We will be talking principally in terms of job task
inventories, which have been developed using the CODAP system. Each
of the armed services is now able to describe certain enlisted specialties
with lists of specific tasks which people actually do on the job.
The CODAP system (NOTAP in the Navy) clearly represents the best means
yet available for the scientific description of work.

It  may  help  us  to  recall,  for  a  moment,  the  history  of  this  effort. 
Modern efforts to describe work scientifically and to use that data
in scientific management are about 150 years old. During that time
there has been an evolution of ideas and of techniques which is
interesting to recall. Furthermore, it may be useful for us to be
aware  of  the  key  approaches  taken  in  the  field,  and  of  some  present 
alternatives  to  the  task  inventory  method. 

We are talking about description of the work process. Possibly
the first scientific attempt in this direction was that of Charles
Babbage in England, who in 1832 published his On the Economy of
Machinery and Manufactures. Some of you will remember the name.
Babbage was the distinguished mathematician who is the true father of
computing machinery. With the primitive means at his disposal he built
"analytical engines" which computed mathematical tables still used until
recently. His most ambitious machine was never built, but would have
been equal to a small modern computer. His computers incorporated
most features of modern machines, including punched card input,
separate registers for data and program, and automatic printout.

Babbage is less well known for the fact that he pioneered in the
study of the industrial work process and in the rationalization of
manufacturing. His methods were reminiscent of "operations research,"
as that term was used by the British in the '40s and '50s. He was
particularly interested in quantifying work, and his practice was in
some respects very modern. His book was republished several times in
Britain and the United States, and his methods were applied, for
instance, to the manufacture of pins and in the British Post Office.
As a result, his research in the work process achieved more practical
application than his "analytical engines."

Often a genius seems to be before his time; so it was with Babbage.
His notions on the computer and on the study of work were not systematically
followed up, although they may have stimulated some similar
writings by J.R. Perronett in France.


It  was  not  until  a  half  century  later  that  F.W.  Taylor  laid  the 
foundation  for  a  science  of  work  management.  Taylor  described  the 
basic principles of "time study" by which any job was divided into
elementary  tasks,  measured  for  time,  and  ordered  for  productivity. 
Taylor's  emphasis  was  upon  time  for  performance.  He  promoted  the 
production  of  more  products  per  worker  and  per  unit  time.  For  this 
reason,  some  of  his  results  have  been  considered  dehumanizing.  He 
observed motion and decision making only incidentally.

The concept of difficulty is very important to us, because it
affects the differentiation of workers on the basis of aptitude.
Taylor did not address difficulty, except as it may have been implied
in his use of terms such as "delay," "complexity," "skill level" and
"effort." Although he did consider training and worker selection,
he did not treat difficulty as an independent concept.

In the 1900s Frank and Lillian Gilbreth originated "motion study,"
incorporating a consideration for the physiological and psychological
capacities of the worker. They used photographs to document work
processes, and they implied difficulty when they discussed the problems
of learning, fatigue, monotony, attention and decision making. During
World War I, the Gilbreths developed training in the assembly and disassembly
of weapons, initiating (along with Hugo Munsterberg) the
tradition of military human factors research which we are continuing.


Babbage,  Taylor  and  the  Gilbreths  represent  an  approach  centered 
on  engineering  and  management.  In  1913,  Hugo  Munsterberg  became  the 
first  formally  trained  psychologist  to  study  industrial  management, 
and treated problems of task learning, adjustment to physical conditions
and economy of motion. He was active in applied military research
during World War I. It is of interest that the practice of
testing for personnel selection dates effectively from the Army Alpha
and Beta tests that were used during the 1917-18 mobilization.

After the war, L.H.C. Tippett studied the British textile industry,
using statistical methods (Tippett, 1935). Tippett's contribution
included the concept of activity sampling (Barnes, 1956). The statistical
approaches Tippett used foreshadow the methods of modern manpower
research, especially in the military, where statistical treatment
makes it possible to generalize about the characteristics of
work and of populations of workers. Like others before and since,
Tippett avoided confronting the concept of "difficulty" as a task
variable.


Time and Motion Study

From these and other efforts a descriptive science of work had
emerged by the 1950s, often referred to as "time and motion study."
In this classical practice, "job difficulty" was seen as that which
forces a worker to perform at less than a "standard pace" theoretically
achievable. These ergonomic difficulties encompassed a number of
factors which might impede the pace of work, such as weight to be
overcome, distances to be moved, hand-eye coordination and asymmetry
of movement. Not included was any notion of difficulty resulting from
complexity of information processing, nor from effort to acquire
knowledge or skill. Central nervous system processing steps such as
"search," "plan" and "select" were considered, but were seen only as
"...elements that tend to retard accomplishment..." As a result, a
standard text on Work Measurement (Abruzzi, 1956) devotes half a
column of its index to "delay," but does not list "difficulty." The
research so far described laid the basis for a science of task analysis,
but it provided no metric for difficulty which would assist the
decision to train, to select job content, or to select personnel.


Job  Evaluation 


Since  about  1920  there  has  been  continuing  development  of  schemes 
for the classification and grading of jobs. That effort has had little
interaction with the engineering-management approach or applied
psychological approach we have just outlined. Job evaluation is concerned
with the practical considerations of wage and grade administration.
It seeks to rate jobs on the basis of relative scarcity of
qualified workers, or expedient dollar costs for work. The job evaluation
metric is value; "difficulty" is implied, but is not rigorously
defined. Task structure is never described, even on a sampling basis.

Systems were first developed which, in general, measured jobs in
their entirety rather than task by task. Early techniques used simple
whole-job rankings, in which a panel would rank a set of existing jobs.
About 1922, the Grading or Classification Method appeared, pioneered
by the Bureau of Personnel Research at the Carnegie Institute. This
method required the design of a scale, containing descriptive statements
of the levels of duty, responsibility or knowledge required at
each rating level. Raters would then match jobs to the scale,
occasionally using a factor weighting system, for instance 60% filing,
40% typing. This system survives in the current practice of the U.S.
Civil Service Commission. A later system described by Lott required
identifying "factors," often a psychologically mixed bag: skill,
strength, effort, responsibility, mental requirements, or aptitude for
learning might be considered together in a list of 8-15 or more
factors. Different jobs were then classified according to the degree
or level to which each factor was required for each job. Factors were
weighted for value, and job ratings derived by various algorithmic
procedures.
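
The arithmetic behind such a weighted-factor rating can be sketched briefly. The following is a minimal illustration and not a reconstruction of Lott's procedure: the factor names, weights, degree levels, and job titles are all assumed for the example, and the rating is simply a weighted sum of factor degrees for the whole job.

```python
# Hypothetical factor weights (summing to 1.0); not taken from any published scheme.
FACTOR_WEIGHTS = {
    "skill": 0.4,
    "effort": 0.2,
    "responsibility": 0.3,
    "working_conditions": 0.1,
}


def job_rating(degrees, weights=FACTOR_WEIGHTS):
    """Return the weighted sum of factor degrees for one whole job."""
    missing = set(weights) - set(degrees)
    if missing:
        raise ValueError(f"no degree assigned for factors: {sorted(missing)}")
    return sum(weights[factor] * degrees[factor] for factor in weights)


# Illustrative degree levels (1 = lowest requirement) assigned by a rating panel.
jobs = {
    "file clerk": {
        "skill": 2, "effort": 2, "responsibility": 1, "working_conditions": 1,
    },
    "radio repairman": {
        "skill": 4, "effort": 3, "responsibility": 3, "working_conditions": 2,
    },
}

# Order the jobs from highest to lowest rating to form a job hierarchy.
ranking = sorted(jobs, key=lambda name: job_rating(jobs[name]), reverse=True)
```

Note that the rating attaches to the job as a whole; nothing in the calculation refers to the individual tasks that make up the job.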

A  further  elaboration  was  the  factor-comparison  method,  attributed 
to  Benge  or  to  Mitten.  This  system  used  a  larger  number  of  factors, 
not common to all jobs. Each factor was ranked independently, using
a benchmark job factor scale which might include a dollar rate of pay
index to assist factor-to-factor comparability. In spite of the number
of "factors," there was no systematic task breakdown.

The use of "factors" in job evaluation is comparable to a similar
use of factors in describing tasks. The various CODAP task inventories
match individual tasks to numerical values for factors such as
difficulty, criticality, frequency of performance and time required to
perform. Notice the difference: job evaluation assigns factors to
the job as a whole. The task inventory approach describes each task
separately.

Job  evaluation  schemes  met  the  need  of  organisations  to  establish 
hierarchies  of  jobs  which  were  related  to  job  market  conditions,  and 
which  could  be  defended  as  objective.  They  contribute  little  else  to 
management. The concept of "difficulty" they assume is a mix of
factors, including factors of aptitude, learned competencies, job
attractiveness, and conditions of the labor market.


Modern Developments

During and following World War II, new directions of effort emerged.
One was "Operations Research," a British term for mathematical modeling,
to find optimum strategies or paths within systems with several interacting
variables. From this concept of whole-system quantification
came development of the "systems approach" for planning within the U.S.
Armed Services. The term had at least two implications of interest
here.


First,  it  made  obvious  the  need,  in  developing  new  weapons 
systems,  to  recognize  human  requirements  and  to  have  trained  manpower 
ready  to  man  the  new  hardware  when  it  entered  service.  This  required 
a  means  to  predict  the  content  of  future  jobs.  Manpower  research, 
hitherto  a  measurement  science,  was  required  to  undertake  prediction. 

A  second  and  related  implication  of  the  systems  approach  was  that 
a  whole-system  model  should  be  applied  to  the  cycle  of  recruitment, 
selection, training, and career progression. Scientifically quantified
means were needed, by which to make management decisions regarding
the apportionment of recruit talent among competing fields, the content
of training, when and how to train, how to apportion tasks among
workers, and how to structure specialties and skill levels among
servicemen. It was to meet this need that the USAF began personnel
and manpower research in the early 1950s, and initiated its occupational
survey programs.

Meanwhile  new  dimensions  were  added  to  the  practice  of  work  study 
by  behavioral  scientists. 

The term "human engineering" became common. Generally that term
identified the measurement of human strength, speed, perception and
motor skill in relation to the control of machines. This work contributes
to our understanding of the possible dimensions of difficulty.
Ryan treats difficulty as task effort required, a function of foot-pounds
of physical work, and metabolism. Required effort may be increased
by the intrusion of fatigue, physiological stress, or induced
inefficiency caused by psychological stress or boredom. Mundell
recognizes six categories of physiological "difficulty": (1) the
percent of total body required to perform a task, (2) whether or not
foot controls are required, (3) whether hand motions required are
simple or complex, (4) level of hand-eye coordination required, (5)
grossness of hand motion tolerated, and (6) weight (force) to be moved.
Such studies provide a limited taxonomy of physiological difficulty,
but are little help where the primary obstacles to task performance
are the need for skills or knowledge. Mention of human information
processing is still noticeably absent.

The  term  "job  analysis"  describes  varying  methods  for  acquiring 
data descriptive of a job. That practice is an advance in relation to
both time and motion study and job evaluation, combining features of
both. Job analysis seeks to provide a critical job-content profile,
using as source data interviews, activity logs, observation logs,
critical incident reports, biographical inventories, standardized
tests, or similar material. A profile of required aptitudes, traits
and competencies is derived, and ideally is then validated against the
population for which it is designed, using a regression equation or
a multiple cut-off strategy.

The term "task analysis" often is used interchangeably with "job
analysis." That is unfortunate, for there is a critical difference.
In task analysis the focus is on the component tasks of a job. Tasks
are to be described in terms of specific stimuli, standards, conditions
and actions, or alternatively cues, conditions, actions, and products.
There is an obvious parallel with Mager's statement of the elements
required for an instructional objective. Most task analyses have the
remaining weakness that they are descriptive, rather than quantitative,
and therefore difficult to compare one to another. They resist
computer handling, and are not convertible by an algorithm either to
training objectives or to aptitude requirements.

Such quantification is possible, however, using job inventories
as developed by AFHRL.


USAF  Research 

USAF established its Occupational Research Project in 1958,
initiating a systematic program for the study of military and civilian
jobs in relation to human resources. A major product of that effort
was the development of job descriptive information, under management
of the CODAP analysis system, with quantified factor data. AFHRL
selected the job inventory approach as being feasible, and as providing
data which could be quantified and manipulated by computer. A job
inventory is a list of all tasks normally performed by a worker, derived
from a survey of job incumbents. AFHRL prepared comprehensive lists of
task statements, derived from several sources. After several steps of
technical editing and review, those lists were mailed to workers. Each
worker checked off those tasks he normally performed, wrote in any
tasks which were not listed, and recorded time spent on a relative
scale. Using this procedure, the job of any worker or group of
workers can be defined by a subset of tasks from the inventory.
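
A small sketch may make the procedure concrete. It assumes, for illustration only, that each incumbent's relative time ratings are converted to percentages of his total time and then averaged across a group; the task labels, ratings, and function names are hypothetical and are not taken from CODAP itself.

```python
def percent_time(ratings):
    """Convert one incumbent's relative time ratings to percent of total time."""
    total = sum(ratings.values())
    if total == 0:
        return {}
    return {task: 100.0 * rating / total for task, rating in ratings.items()}


def group_description(responses):
    """Summarize a group job: percent of members performing each task and
    average percent time spent (members not performing a task count as zero)."""
    if not responses:
        raise ValueError("at least one incumbent response is required")
    n = len(responses)
    shares = [percent_time(r) for r in responses]
    tasks = sorted({task for r in responses for task, rating in r.items() if rating > 0})
    return {
        task: {
            "pct_performing": 100.0 * sum(r.get(task, 0) > 0 for r in responses) / n,
            "avg_pct_time": sum(s.get(task, 0.0) for s in shares) / n,
        }
        for task in tasks
    }


# Two hypothetical incumbents; keys are task statements, values are relative time ratings.
responses = [
    {"A": 5, "B": 5},
    {"A": 3, "C": 1},
]
description = group_description(responses)
```

Here the group job is the subset of inventory tasks {A, B, C}, each tagged with the share of members performing it and the average share of time it takes.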

As you know, CODAP provides a sophisticated and flexible means
for managing the task inventories and associating them with factor
data of various kinds. The CODAP system has been adopted or modified
by each of the Armed Services for its own use. I will leave it to the
panel members present today to comment further on what those services
are doing. But I do want to make one point about task inventories and
training.


Data for Training


One of the primary reasons for the scientific study of work is
so that we can train workers effectively. The systems model for
training (ISD) presumes that training will be based on the content
of jobs. The ISD process requires that training development begin
with a task analysis. As a matter of fact, the ISD documents are a
little vague about how this will be done.

Conventional task analysis was descriptive. As usually practiced,
it was useful for management, but provided little specific data for
training developers. A typical task analysis reached a level of
specificity close to that of a course description, but not approaching
the task-by-task description needed to write learning materials.
A specific job might be described differently by two different analysts,
and that data was not suitable for computer management.

The CODAP task inventories take us an order of magnitude closer to
what training developers need. These data can be replicated and
machine  processed.  The  task  statements  are  moderately  specific  and 
use  a  disciplined  vocabulary.  Each  task  can  be  tagged  with  valuable 
data  such  as  the  frequency  of  task  performance.  Still  the  CODAP  task 
statements  are  not  readily  convertible  to  training  objectives. 

A typical USAF task inventory contains from 200 to 1000 or more
task statements, describing the work in one Air Force specialty (AFS).
Yet a single task statement such as "Perform alignment of aircraft
H.F. Radio Receiver" (AFSC 32850) actually describes over 100 discrete
task steps. Each of these steps includes its own distinctive
actions, conditions and standards. Just how those atomic learning
objectives are to be scientifically described, and based on the
objective study of work, ISD does not tell us.

Several  of  the  speakers  today  are  taking  action,  within  their  own 
services,  to  develop  methods  by  which  training  may  be  objectively 
based on work, at the level of a subtask or the level of a programmed
instruction frame. With you, I look forward to hearing the speakers
who  follow. 


A MANAGEMENT FEEDBACK SYSTEM FOR THE AIR FORCE

MAJ  JON  L.  GROSS 


ABSTRACT 


Systematic feedback on the management of people is required to
improve both management and organizational effectiveness. Rensis Likert's
questionnaire, "Profile of Organizational Characteristics," is a proven
technique for providing systematic feedback. The military manager's use
of this questionnaire will help to improve the management of the most
critical resource--people.


Most Air Force managers are familiar with the functions
of management: planning, organizing, directing,
controlling, and coordinating. All managers engage in
these functions to some extent in their efforts to improve
and maintain the efficiency of their operations or programs.
Likewise, to determine their effectiveness as managers, they
need some means of measuring the results of their actions
on people and on their operations. Rensis Likert's "Profile
of Organizational Characteristics" is a proven technique for
evaluating management style and organizational effectiveness.
Use of the Likert questionnaire can help the manager
to  determine  his  current  management  style  and  identify  areas 
that  should  be  changed. 

A key to management improvement is a positive attitude
on the manager's part regarding feedback and a willingness
to evaluate the impact of his actions on people and his
organization.  Such  an  attitude  can  lead  him  to  consider 
various  styles  and  approaches  in  his  attempts  to  improve 
his  ability  to  manage.  The  Likert  questionnaire  can  serve 
as  an  initial  step  by  helping  him  to  determine  his  current 
management  style  and  by  identifying  areas  that  should  be 
changed. 

Purpose  of  Management  Feedback  System 

The  participative  management  approach  is  a  current 
trend  in  the  management  of  people  in  the  military.  Some 
managers  resist  changing  from  the  traditional  authoritarian 
approach  for  fear  that  discipline  will  diminish  and  people 
will no longer respond to orders.1 Most management authorities
believe that the participation of people in decisions
that affect what they do on the job and how they do it is the most
valid, long-term approach to improving the effectiveness
of  people.  Current  problems  require  the  fullest  use  of 
people  in  order  to  be  prepared  for  war  if  it  comes. 

Traditionally,  managers  receive  information  on  how  to 
improve cost control, make better decisions, and manage
people. In each of the first two areas, feedback techniques
have been developed and incorporated in the management
of programs and operations. By identifying areas that
require corrective action, the information feedback loop
has proved helpful in controlling programs. However, few
techniques are currently used to provide systematic feedback
to military managers on what is happening with people
as a result of management decisions or actions. Admittedly,
"open door" policies and various councils provide some
information, but managers normally depend on these approaches
to identify and resolve grievances. If things are "OK" or if
the level of complaints is reasonably low, management usually
takes no action.2 When these feedback systems identify
areas  that  require  change,  the  manager  may  focus  only  on  the 
most  visible  and  apparent  problems.  Apparent  problems  may 
be  solved,  but  the  covert  cause  of  a  problem  may  remain  to 
create  other  apparent  problems.3 

A factor that generates a need for an effective management
feedback system in the military is the trend toward
longer assignment tours for managers and their personnel.
A traditional attitude has been expressed in these words:
"Don't let the personality problems bother you and affect
your actions. One of you will be transferred in a year or
so, and you can get by for that long." Longer tours place
greater emphasis on solutions to problems that affect the
performance of people. People want to feel that they are
needed and useful, that their work is important, and that
they have a voice in what happens to them. If they cannot
find satisfaction in their jobs and if they cannot expect
transfer, they will look outside their jobs for satisfaction.
People need to understand the importance of their
efforts,  know  that  they  have  a  stake  in  what  happens  to 
them,  and  feel  dedicated  and  satisfied. 

Just as people are keys to greater output with diminishing
resources, the manager is the key to more effective
use  of  people.  Feedback  to  the  manager  not  only  provides 
direct  benefits  in  terms  of  dollar  costs;  it  also  informs 
the  manager  concerning  his  effectiveness  in  the  management 
of  people. 

Executives  and  managers  need  feedback  on 
their  performance  and  effectiveness,  just 
as  subordinates  do.  One  kind  of  feedback 
that  can  be  extremely  helpful  is  a  survey 
of employees' perceptions of their supervision,
working conditions, opportunities,
and other factors. This device is not only
communicative in itself; it also is a catalyst
to increase the quantity and quality
of communications on a continuing basis.4


A Management Feedback System

Rensis  Likert's  questionnaire  is  a  tested  measurement 
tool  for  providing  management  feedback  on  the  following 
organizational  characteristics: 

o  Leadership  processes 

o  Motivational  forces 

o  The  communication  process 

o  The  interaction-influence  process 

o  The  decision-making  process 

o  Goal-setting  or  ordering 

o  Control  processes 

o  Performance  goals  and  training 

The questionnaire consists of 51 questions that measure the
perceptions of members about their organization; it does not
measure what is good or bad, right or wrong.5 The manager
must determine the characteristics that are appropriate
for his organization. The organization does not exist to
improve itself; it exists to fulfill a need. Thus, by
measuring the characteristics of his organization, the
manager can guide it toward the optimum conditions necessary
for maximum performance of the mission.

Figure 1 depicts the Likert approach to organizational
improvement.6 To measure the "state" or current condition
of  the  management  system,  Likert  defined  three  sets  of 
variables  as  follows: 


a.  "Causal"  variables  are  independent  variables 
that  determine  the  course  of  organizational  development. 
These are the only variables that management can change.

b.  "Intervening"  variables  reflect  the  internal 
state  and  health  of  the  organization. 

c.  "End-result"  variables  are  dependent  variables 
that  measure  achievements  of  the  organization. 7 
[Figure 1. The presence of "causal" variables yields "intervening"
variables which, in turn, lead to "end-result" variables, for
System 1 or 2 and System 4 operation. (From The Human Organization
by Rensis Likert, 1967.)]


"Causal" variables controlled by management produce changes
in  the  "intervening"  variables,  which,  in  turn,  change  the 
"end-result"  variables.  Therefore,  the  most  effective  way 
to  change  the  organization  is  to  modify  the  "causal" 
variables. 8 

The Likert questionnaire measures causal and intervening
variables.8 In using the questionnaire, the manager
can locate the management system of his organization on a
continuum scale ranging from Exploitive-Authoritative
(System 1) to Participative Group (System 4). The two remaining
categories on the scale are Benevolent-Authoritative (System
2) and Consultative (System 3). Likert states that "the
four different systems really blend into one another and
make one continuum with many intermediate patterns."10

After the manager initially locates the management
system of his organization on the scale, he can then, through
periodic use of the scale, track the movement of his
organization. The thrust of Likert's work is that the organization
improves in effectiveness of production and management of
people as it moves toward System 4 (Participative Group).
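
Placing an organization on the continuum can be sketched in a few
lines of Python. This is only an illustration: the 1-to-20 response
scale and the four equal-width bands are assumptions made for the
example and are not given in this article, and the function name
locate_system is hypothetical.

```python
from statistics import mean

# Assumed response scale, for illustration only; the article does not give one.
SCALE_MIN, SCALE_MAX = 1.0, 20.0

SYSTEMS = (
    "System 1 (Exploitive-Authoritative)",
    "System 2 (Benevolent-Authoritative)",
    "System 3 (Consultative)",
    "System 4 (Participative Group)",
)


def locate_system(responses):
    """Return (mean score, system label) for a list of item responses."""
    score = mean(responses)
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError("responses fall outside the assumed scale")
    # Split the continuum into four equal-width bands (an assumption).
    band_width = (SCALE_MAX - SCALE_MIN) / len(SYSTEMS)
    index = min(int((score - SCALE_MIN) / band_width), len(SYSTEMS) - 1)
    return score, SYSTEMS[index]
```

Because Likert describes the systems as blending into one another,
the mean score itself is the more informative figure; the band label
is only a convenient summary.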

Military  Examples 

The  U.S.  Navy  uses  a  modified  version  of  the  Likert 
approach  to  provide  feedback  information  to  Navy  managers 
and  supervisors.  With  the  assistance  of  the  Institute  for 
Social  Research,  University  of  Michigan,  it  has  expanded  the 
Likert  questionnaire  and  adapted  it  to  Navy  organization  and 
terminology.  The  Navy  program,  known  as  the  Human  Resources 
Management  Support  System,  includes  three  major  categories: 

a.  Human  Resources  Management — leadership, 
management,  and  overseas  diplomacy. 

b.  Equal  Opportunity/Race  Relations. 

c.  Drug Abuse Control and Alcoholism Prevention.11

This  discussion  limits  the  Likert  approach  to  the  leadership 
and  management  areas,  which  the  manager  directly  controls 
and  influences. 

Figure 2 provides one example of the data obtained by
the Navy's Human Resources Management Survey.12 A comparison
of Figure 2 with Figure 3 shows the similarity between
the results achieved by the Navy survey and the results
from the Likert questionnaire. The broken line represents
the survey results; the solid line represents the Command
Overall graph. The Navy approach allows the manager to compare
average perceptions within his organization with
average perceptions for the command. The principal difference
between the Navy questionnaire and the Likert questionnaire
is that respondents to the Navy questionnaire did not
indicate their organizational preferences.
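
The comparison of unit averages with command averages can be
sketched as follows. This is a minimal illustration, not the Navy's
procedure: the function name compare_to_command, the dimension names,
and the score values are all hypothetical.

```python
from statistics import mean


def compare_to_command(unit_responses, command_averages):
    """Return {dimension: (unit mean, command mean, difference)}.

    unit_responses maps a dimension name to the individual scores from
    one unit; command_averages maps the same names to the command-wide
    average. Dimensions missing from either input are skipped.
    """
    comparison = {}
    for dimension, scores in unit_responses.items():
        if not scores or dimension not in command_averages:
            continue
        unit_mean = mean(scores)
        command_mean = command_averages[dimension]
        comparison[dimension] = (unit_mean, command_mean, unit_mean - command_mean)
    return comparison


# Hypothetical data: two survey dimensions for one unit.
unit = {"Leadership": [3, 4, 5], "Communication": [2, 2]}
command = {"Leadership": 3.5, "Communication": 3.0}
for name, (u, c, diff) in compare_to_command(unit, command).items():
    print(f"{name}: unit {u:.2f}, command {c:.2f}, difference {diff:+.2f}")
```

A positive difference means the unit's average perception is above
the command average on that dimension; a negative one means it is
below.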

A potential benefit of the Navy program is a determination
of the effectiveness of certain management styles as
they apply to specific types of military units. Managers
with appropriate styles can then be assigned to jobs that
match their individual styles. Another benefit is that the
manager can monitor the trend of organizational change and
the relative magnitude of the change over a given period.

To date, the Air Force has made only limited use of
similar data-producing instruments to improve management
effectiveness. The Leadership and Management Development Center
at Air University has used surveys in specific management
problem areas after the problems have been identified through
interviews. The survey results cover only short periods;
the long-term effects of management decisions have not yet
been targets for investigation. Emphasis has been given to
immediate problems and immediate results.

The  author  of  this  article  and  two  other  Air  Force 
officers  applied  the  Likert  questionnaire  to  an  Air  Force 
research  and  development  organization.**  The  survey  results 
for  the  organization  are  shown  in  Figure  3,  which  displays 
the  average  value  for  NOW  responses  and  LIKE  responses.** 

NOW responses are measures of members' perception of the
organization at the time of the survey. LIKE responses
indicate the characteristics that members would prefer in
the organization. In addition to the summary results, the
survey included specific responses by military rank and
civilian grade on each question.
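
The NOW and LIKE averages, and the gap between them, can be computed
as in the sketch below. It is an illustration under stated
assumptions: the tuple layout, the question labels, and the function
name now_like_gaps are hypothetical and do not come from the survey
described here.

```python
from statistics import mean


def now_like_gaps(answers):
    """Average NOW and LIKE scores per question and the gap between them.

    answers is a list of (question, now_score, like_score) tuples, one per
    respondent per question. Returns {question: (now_mean, like_mean, gap)},
    where gap = like_mean - now_mean.
    """
    by_question = {}
    for question, now_score, like_score in answers:
        now_list, like_list = by_question.setdefault(question, ([], []))
        now_list.append(now_score)
        like_list.append(like_score)
    return {
        question: (mean(now), mean(like), mean(like) - mean(now))
        for question, (now, like) in by_question.items()
    }
```

A large positive gap on a question marks a characteristic where
members would prefer the organization to differ most from how they
perceive it now.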

Implementing  Survey  Feedback  System 

The survey feedback system should be implemented at
organizational levels that allow subject personnel to associate
with a specific mission, such as developing a weapon
system, maintaining a weapon system, or providing a particular
service like transportation. That is, people in small
organizational units can more readily identify with the
program and participate in the effects of the program.

Figure 4 shows a diagram of the survey feedback system.
The manager must decide whether to use a survey feedback system
in his organization, and he must have support or, at
least, acceptance of upper-level management. Basically, the
proposal is a bottom-up program in which the manager using
the system is supported by his superior.

The plan for implementing the feedback system should be
developed jointly by the manager and his subordinate supervisors.
By including the supervisors in the planning phase,
the manager can insure commitment and understanding.

In  the  same  manner,  the  manager  should  explain  the  sur¬ 
vey  feedback  system  to  the  members  of  the  organization.  The 
purpose  and  use  of  the  survey,  presented  and  discussed  with 
members,  should  aid  in  gaining  acceptance  of  the  system.  The 
explanation  of  the  survey  must  stress  the  fact  of  respondent 
anonymity.  Emphasis  on  the  anonymity  of  the  respondents 
should  encourage  candid  responses.  As  trust  in  the  manager 
improves,  relationships  between  the  manager  and  the  members 
of  his  organization  should  become  more  mutually  supportive. 

The time required to implement the system can be shortened
with an existing questionnaire of proven validity,
since a new survey requires testing for reliability and
validity. Problems in the initial phase of the system will
probably center around implementation and understanding of
the survey questionnaire. Once the system has become functional,
feedback for the manager should be forthcoming.

Upon completion of the survey, the manager should tabulate
the results and present them to the members of the
organization. Sharing of results encourages an open atmosphere
and gives members an opportunity to compare their perceptions
with group responses. Furthermore, common understanding
of the results should enable the members to
participate in identifying group problems and developing
proposed solutions. By participating in the development of
proposed solutions, members will be more committed to
implementing the solutions.

With the initial survey results as a starting point,
the manager can monitor immediate changes in his organization
and, through periodic use of the survey every three to
six months, he can evaluate changes in the characteristics
of his organization and the direction of the changes. He
may find it helpful to have a management consultant; the
Leadership and Management Development Center at Air University
can provide such support for Air Force managers.
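
Tracking the direction and magnitude of change between periodic
surveys can be sketched as follows. The period labels and scores are
hypothetical, as is the function name survey_trend, and the sketch
assumes that higher mean scores lie nearer System 4.

```python
def survey_trend(period_means):
    """Return the change between consecutive survey periods.

    period_means is a list of (label, mean_score) pairs in chronological
    order. Each result is (from_label, to_label, change, direction).
    Assumes higher scores are nearer System 4.
    """
    changes = []
    for (prev_label, prev_mean), (label, current_mean) in zip(
        period_means, period_means[1:]
    ):
        change = current_mean - prev_mean
        if change > 0:
            direction = "toward System 4"
        elif change < 0:
            direction = "toward System 1"
        else:
            direction = "no change"
        changes.append((prev_label, label, change, direction))
    return changes
```

With surveys taken every three to six months, each entry in the
result covers one interval, so the manager can see both which way the
organization moved and by how much.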

If several units within a large organization use the
survey feedback system, the measurement effort can be performed
at various levels by applying Likert's linking-pin
concept to tie the levels of the larger organization
together.18 It is not necessary, though it is desirable,
that all units use the program. One potential problem of
using the survey in some, but not all, organizations is that
people may be rotated to units that have not implemented
management feedback systems.

Summary 

This  article  suggests  that  the  Air  Force  needs  a  tool 
for  systematic  measurement  of  management  and  organizational 
effectiveness.  The  results  of  systematic  measurement  of 
management  effectiveness  should  be  available  to  the  manager 
in  the  same  manner  as  cost  accounting  and  schedule  control 
systems.  A  management  survey  is  available  to  improve  manage¬ 
ment  of  people. 

The  purpose  of  a  management  measuring  system  is  the 
improvement  of  management  and  organizational  effectiveness; 
such  a  system  is  not  intended  as  a  means  of  grading  the 
manager's  performance.  The  idea  is  to  close  the  gaps 
between  action,  reaction,  and  corrective  action.  This  is 
necessary  because  management  is  not  an  exact  science. 

Action  intended  to  produce  a  given  result  can  produce  some¬ 
thing  quite  different,  and  the  time  between  reaction  and 
corrective  action  can  be  critical. 

Consistent  with  accepted  organization  development  con¬ 
cepts,  management  actions  must  be  tailored  to  the  organiza¬ 
tion  and  its  situation.  However,  tailoring  requires  measure¬ 
ment  of  the  current  state  of  an  organization  and  establishment 
of  a  base  for  comparing  the  direction  and  magnitude  of 
change.19  Just  as  the  trial  and  error  approach  to  weapon 
system  acquisition  is  now  an  unaffordable  luxury,  so  is  the 
trial  and  error  approach  to  the  management  of  people.  Use 
of  a  proven  instrument  such  as  the  Likert  questionnaire  can 
provide  systematic,  meaningful  feedback  for  improving  the 
management of the Air Force's most critical resource: people.
JOB ENRICHMENT IN THE MARINE CORPS
(A  CONCEPT) 

BACKGROUND 

As Defense Department funds become tighter and the pressure
increases to do more with existing resources, we must find ways
to improve the performance/productivity of our manpower assets.
This fact of life is exacerbated by the shrinking manpower pool
and generally expanding economy which will reach its full impact
on recruiting in the early 1980s. If we are to compete successfully
with the other services and the civilian market for the
high quality people needed in the Marine Corps, we must seek new,
often bold, initiatives to attract, train and retain Marines.

During  the  first  half  of  this  decade,  the  Marine  Corps 
experienced an alarming increase in unauthorized absence
(UA)  and  desertion  rates  and  a  decline  in  retention  rates. 

Lack  of  quality  Marines  was  considered  to  be  the  most 
significant  reason  for  these  unfavorable  trends.  In  1975, 
the  Commandant  of  the  Marine  Corps  established  recruiting, 
discharge,  and  retention  policies  designed  to  raise  the 
overall  quality  of  Marines  in  the  Marine  Corps.  Consequently, 
the  unfavorable  trends  in  UA,  desertion,  and  to  a  lesser  degree 
retention,  have  been  significantly  reduced. 

Unfortunately, these very successes beget problems of
their own in that more "highly-educated" Marines require
challenging jobs, jobs with substance that draw on their
abilities, jobs with which they can identify and within which
they can grow. In short, high quality Marines require the highest
quality of leadership and manpower utilization if we wish
to attract and then both sustain them and retain them.

Major efforts continue to be exerted relative to the best
methods for effectively classifying and assigning Marines.
That effort of balancing individual ability, potential and
desires, to mission requirements must receive continuing attention
for it is the basis upon which we can build meaningful
jobs and a motivated force of Marines.

Assuming  the  classification  effort  is  optimized,  con¬ 
sidering  current  constraints,  we  must  then  take  cognizance 
of  the  following  chronology  of  issues: 

First, today's Marines, while possessing the same basic needs
as always, also bring to our leadership picture a significantly
new frame of reference and a new set of contemporary
expectations which must be met if we are to employ their talents
optimally in the accomplishment of our mission.

This issue of contemporary expectations deserves our utmost
attention. Possibly no one would deny any of our time-tested
leadership principles and the identified traits of successful
leaders. Neither should we deny the social and psychological
changes manifested in the youth of today. Tradition has long
been a mainstay of military life. It provides each of us with
an identity; it provides us with a code of behavior. Nonetheless,
tradition must also be kept contemporary. Failure to do
so will evidence an inability to identify with the organization
in daily, ongoing processes. This fact alone requires that
we look at our recruits "as they are" and "not as we want them
to be". Once we get on a common, humanistic, contemporary plane,
we can then begin to blend more closely personal and organizational
goals into a common mission, i.e., we must build on their
talents. Doctor Charles Moskos of Northwestern University, at
an attrition conference in Washington, D.C., in June 1977,
admirably identified the salient differences in attitudes between
the typical draftee in the late 1950s and early 1960s and those
of the typical volunteer of the late 1970s.


INSERT  CHART  1  ABOUT  HERE 

Doctor Moskos stated that the above typology should direct
attention to the kinds of expectations and behavior of the new
volunteer which can lead to high levels of attrition if not
properly handled. In short, pride and motivation are concepts
that must be kept contemporary. If Doctor Moskos' analogy has
merit, and I believe it does, then the recruit of today wants
to know why and how today's situations are relevant to him.
Yesterday's heroes and successes were important in the derivation
of principles, but we must now show their relevance in an
ever-changing society and in the workplace that he confronts daily.
Again, this issue of contemporary expectations is an area
requiring much exploration and research.

Second, recruitment offers the young potential Marine a
challenge to draw upon the full range of his talents and then
to improve his competence in order to sustain himself as a
flexible, total Marine. We infer he will become a totally
competent, balanced professional.

Third,  recruitment,  in  effect,  establishes  a  psychological 
contract  between  the  potential  recruit  and  the  job  we  offer  him 
in  the  Corps.  It  is  here  that  the  expectations  of  the  potential 
Marine  are  established.  Those  expectations  and  concomitant 
challenge  should  then  be  heightened  in  Recruit  Training  where 
his professional training is initiated and the often arduous
process of conversion from civilian to self-disciplined Marine
is  to  be  accomplished. 

Fourth, upon reporting for on-the-job training (OJT) and
the normalizing of a Marine's work day, he finally comes face to
face with the reality of his job. His identity as an important
(or unimportant to him) individual in a larger organization with
a mission is established here. His niche in the unit solidifies
at this point. His relation to his job and his peers for the
foreseeable future develops here. The fulfillment of his psychological
contract with the Corps, which was established during the
recruiting process, should come to fruition at this point in the
Marine's enlistment. If he does not develop positive attitudes
as a result of work that is meaningful to him as well as the Corps,
then the gap between expectations and reality will produce either
malbehavior or at least nondescript behavior, neither of which
is beneficial to either party to the contract. Job design is
paramount at this point. Sick jobs rapidly initiate feelings
of uselessness and disillusionment, for at this crucial point,
his actual job, the Marine sees what is truth to him: the
usefulness of his talents or the converse.




To reduce or eliminate the gap between expectations and
reality requires a thorough understanding of two issues:
(1) what we do TO our Marines (context/environment/extrinsic
factors) and (2) what we do WITH our Marines (job content
factors; factors that provide meaning to what people do,
motivating factors; in short, how we USE our Marines). These
two issues are manifestly important to the mental health and
psychological growth issues so paramount to the young, high
quality recruits we wish to enlist in our Corps. Referring to
Doctor Moskos' typology again, we can plainly see that in
adequately answering the questions of how we treat and how we
use our personnel assets, we begin to answer the apparent
existential question implied in the profile of the new volunteer,
which is to bring meaning and purpose to the quality of each
Marine's personal and work life.

THE  ISSUE 

It  is  postulated  that  if  we  are  seeking  an  increase  in 
motivated  behavior  and  if  basic  propriety  in  the  classification/ 
assignment  of  Marines  (competence  to  perform)  can  be  assumed, 
then  a  program  of  job  enrichment  should  materially  benefit  an 
organization.

Job redesign requires knowledge of the elements/indices of
good and bad jobs, a knowledge of situational dynamics and
ability to apply that knowledge in the existing workplace. In
short,  we  need  to  operate  from  a  theoretical  framework  from  which 
we  can  derive  principles  of  sound  job  redesign.  Such  a  program 
must  ultimately  support  the  Marine  Corps  proposition  that 
responsibility for command performance rests with the leader
of  the  command.  Accordingly,  any  program  effort  must,  in  the 
final  analysis,  be  employed  by  those  who  lead — not  some 
external  agent  as  is  the  case  in  most  organizational  development 
strategies. 

Doctor Frederick Herzberg of the University of Utah is the
creator of Motivation-Hygiene Theory (M-H-T), which provides a
comprehensive framework for his Orthodox Job Enrichment (OJE)
program. I contend that this theory and job enrichment philosophy
constitute both a managerial strategy and a process conducive
to success. It is a strategy that involves theory, concepts
and views on the nature of man and a direction for
management. As a strategy it incorporates the dynamics of human
relations, communications, administration, salary, etc. As a
process it involves putting theory to practical use via the
institution of good job ingredients into the work place.

It  would  be  ludicrous  to  attempt  a  comprehensive  explanation 
of Doctor Herzberg's theory and applications in this paper.
However,  the  salient  points  of  the  theory  are  necessarily 
reviewed  since  they  provide  the  theoretical  underpinning  for 
understanding  behavior  and  attitudes,  situational  dynamics  and 
OJE  methodology. 

STATEMENT OF M-H THEORY

If  any  theory  is  to  be  consistently  applied  and  tested, 
then  its  premises  must  be  assumed  correct.  In  M-H-Theory 
one  must  assume  the  bi-dimensionality  underlying  the  feelings 
of  satisfaction  and  dissatisfaction  i.e.  the  dual  nature  of  man. 




The theoretical basis for job enrichment begins with this
particular view of the nature of man. Though man exists at all
times as a unity, conceptually we view him as having two distinct
natures. Each nature has its accompanying need system. One
nature, his purely biological nature, is similar to that of other
animals: an overriding concern to avoid hurt, discomfort, and
dissatisfaction from the environment, e.g. the work around him.
This avoidance behavior pertains to both physical and psychological
situations (cold, heat, danger, loneliness, feelings of
inferiority, etc.). By avoiding environmental situations that
can cause him discomfort/pain/hurt, he does not achieve positive
meaning in his life. He merely avoids discomfort. Because man
is so high on the phylogenetic structure, he possesses an almost
infinite capacity to learn and experience. Accordingly, the
environmental factors that can cause him hurt come in
an infinite variety and in an equally infinite number of shades,
e.g. too hot, too cold, too much noise, too quiet, etc.

Herzberg called these environmental factors that man is faced
with on the job as well as in everyday life "hygiene factors"
because when managed properly they prevent dissatisfaction.
Other authors have referred to them as maintenance factors. They
are extrinsic to the person. They deal with the issue "How well
are you treating me?". Their overriding dynamic is avoiding
pain/discomfort from the environment. These factors must always be
maintained at a reasonable level, for significant deficiencies
in any maintenance factor cause pain and, like a headache, will
impair all physical and psychological processes until near
homeostasis is reached. Only then can one talk about motivated
behavior.

The second aspect of man's nature deals with his distinctively
human need for continuous psychological growth. Unlike other
animals, who are primarily preprogrammed in their behavior with
a capacity to learn a few new things, man is born very dependent,
is nurtured for a protracted period, but he begins learning at
birth and can learn new things until biological death. When man
does not receive cortical stimulation he begins to display that
common human characteristic: boredom. The human side of man
then demands approach behavior in a quest for constant psychological
growth. He gains from experience. If he doesn't have
these growth experiences he is not any worse off; he merely
gains nothing. He senses an emptiness or a lack of fulfillment.
He feels no satisfaction in this work. Conversely, if he is
experiencing new growth through challenging work, he experiences
feelings of personal growth and worth, feelings of
satisfaction, for he has added to his own personal growth fiber.

The need system associated with the human part of people
is served by the elements of the job itself. These factors
that lead to growth of people are termed "motivators". These
items make the worker want to do his job because it meets his
human need to grow psychologically. Meeting these needs is
the specific managerial/leadership challenge we face today in
the work place. When met, they pay big dividends for an organization
as well as the person.
The following graphically displays the two continua
inherent in M-H Theory.

Job Dissatisfaction <---- Biological Nature ----> No Job Dissatisfaction

No Job Satisfaction <---- Human Nature ----> Job Satisfaction

The  following  diagram  reflects  the  motivation  and  hygiene 
factors  associated  with  the  theory  (arranged  in  normal  profile 
format) : 

INSERT  CHART  2  ABOUT  HERE 

Traditionally, management has placed almost total concern
with meeting the maintenance needs of people. This has resulted
in people movement but not motivated work. In movement we cause
a person to do something he wouldn't ordinarily do by using
either a reward or a threat. When someone is motivated to do a
job, he does it for something contained in the job; he turns on
his own internal generator that doesn't need to be charged by
the leader/manager. These motivator needs are not typically
met in work today. That is why we need to restructure so many
jobs.

Before proceeding to OJE per se, let me summarize the dynamics
associated with both hygiene and motivator factors. Such an
analogy clearly distinguishes the two continua phenomena.

INSERT  CHART  3  ABOUT  HERE 

OJE  ROOTS 

OJE  operates  from  a  motivated  behavior  relationship  drawn 
from M-H Theory:

INSERT  CHART  4  ABOUT  HERE 
The first ratio of ability over potential (A/P) relates to what
a person can do; not what he is like, which is a common displacement
error. The more ability a person has, the more he can be
motivated to perform. This ratio relates to personnel selection
and classification; getting people into the right jobs where they
can develop their potential and abilities. Concomitantly,
training programs are involved here, to maintain personal development
in our rapidly changing technological industry, which tends
to force rapid obsolescence of talents, techniques, and consequently,
people. When a person is in job difficulty, look to his
competence first: "can he do the job?". All too often managers
are told they must motivate underachievers when in reality, the
underachievers lack competence to perform. Motivation therefore
becomes a slogan to protect ego against incompetence.

The second ratio, opportunity over ability (O/A), determines
how much of the person's talent is allowed to show itself. No
one can be motivated to do a good job unless he has a good job
to do. Since attitudes are manifested from the behavior of
people, we should not expect enthusiastic, motivated attitudes
in people who are given fractionated, dull, repetitious, "mickey
mouse" jobs. Given a job that uses only a small fraction of a
person's ability/potential, a job with little or no challenge
or opportunity to grow and to achieve, a job with no place to
grow into via recognition for achievement and thus advancement
to new responsibility; given a job which denudes a person of the
chance to become what he can become, then that person will
exhibit behavior and attitudes characteristic of a person in a
"sick" job, a job that needs enrichment.
Reinforcement of motivators with more motivators and the
reinforcement of hygiene for hygiene purposes is the last factor
in the equation. Hygiene must always be kept reasonable. Any
significant deficiencies must be alleviated, for no one can be
motivated while he is in a state of hygiene deprivation.
Relieve the hurt and then proceed to motivate. On the other
hand, when satisfaction is being attained, when people have a
good job with motivators present, they must receive reinforcement
of those motivators. Does the appraisal system reinforce
growth behavior, which often involves risk, challenge, and a chance
to achieve and be recognized? Does it offer opportunity to
advance and achieve more, to become "what I can become"?

OJE  APPROACH 

Any attempt to reevaluate a "sick job" requires a recognition
of the indices/characteristics/symptoms of a "bad job" as well
as knowledge of the ingredients that should be injected into the
rejuvenated job. Briefly, the indices of a "Bad Job" are:

INSERT  CHART  5  ABOUT  HERE 


Once  the  indicators  of  a  poor  job  are  recognized,  the  OJE 
ingredients  of  a  "good  job"  can  be  utilized  to  enrich  the  posi¬ 
tion  and  improve  the  motivational  aspects  of  the  work. 

INSERT  CHART  6  ABOUT  HERE 

Enriched jobs may not have all these ingredients, but a conscious
infusion of as many as possible of these enriching ingredients
will certainly enhance motivated behavior. These
ingredients are derived from motivators but are more realistic
and practical to work with when focusing on job elements than
the more ambiguous terms of achievement, growth, etc. found in
the standard profile.

Finally, an operating framework is necessary within which
the job enriching motivators can be properly applied; in essence,
the OJE work process must be understood. Essentially, OJE is
an approach which requires top-down understanding and support
while the application starts at the grass roots level and works
up. Diagrammatically, OJE looks like this:

INSERT  CHART  7  ABOUT  HERE 

The job is analyzed with an effort to vertically stack
motivators in the job. To do this we "push down" some currently
higher responsibilities to the job; concomitantly we
incorporate the job's pre and post processes, which fractionate
jobs so often (e.g. prepare own preliminary work such as rough
drafts for a clerk and authorize that worker to sign and complete
the work task), and finally strip away menial, routine
tasks wherever possible. As the job is enriched, redefinition
of higher level jobs is necessary. These higher jobs are then
reorganized toward better management/supervision functions such
as planning, training, etc.

OJE  ISSUES 

As  stated,  the  strategy  of  orthodox  Job  Enrichment  is  to 
start  at  the  bottom  of  the  management  pyramid  and  work  up. 
First-line  supervisors  are  involved  in  enriching  the  jobs  of 
the  line  workers.  Later,  the  OJE  principles  are  applied  to 
the jobs of the first-line supervisors, and so on. This
approach is necessary because enrichment of one level has an
unavoidable impact on the jobs of the immediate supervisors.
Additionally, the bulk of the work force is on the line and this
is where the most deflated jobs exist.

A pivotal person in the enrichment process is the keyman.
Each organization must have one or more keymen. Keymen are
individuals from within the organization who have had extensive
training in job enrichment theory and application and have a
broad knowledge of the functions and operations of the
organization within which they operate. In the OJE process, they are
involved extensively at first, but to a decreasing degree as
the approach becomes more of an accepted and adopted strategy.

Implementation is on a project basis. Projects are approved
by commanders. The first phase in the project is the
establishment of implementing and coordinating committees. The
implementing committee is composed of the supervisor of the
area to be enriched (the key supervisor) and other first and
second level supervisors who can be of assistance in the
implementation of job enrichment. Size of the implementing committee
varies from four to eight. The coordinating committee is made up
of middle and upper level management with which the unit involved
interfaces. Typically, it consists of four to eight members.
The keyman serves as advisor to both groups.

After selection of the committees, the keyman conducts
training in Motivation-Hygiene theory. Training usually
consists of 30 to 40 hours of theory including classroom
exercises to highlight the major points of the instruction.

Following the training effort, the implementing group has
the task of enriching the jobs under consideration. The
technique used is called greenlighting and redlighting.
Greenlighting, or brainstorming, utilizes the concept of deferred
judgment. The purpose is to generate as many ideas as possible
about how to instill motivators into the jobs. After a list
has been greenlighted, the group enters a redlight session.
Here the ideas are evaluated to determine which will be
considered for immediate implementation, which will be kept for
possible future implementation, and which will be discarded.
A time-phased implementing plan is developed.

The implementation of the accepted greenlight items is the
area where the coordinating committee can be most useful.
Having been trained in the theory, this committee can now
understand the strategy behind the changes suggested by the implementing
committee and be of assistance in removing roadblocks to
implementation. The coordinating committee also develops the
measurement plan for the project. Upper level management, as members
of the coordinating committee, provides important assistance to
the project by making it possible for the key supervisor to make
the changes he has decided are needed. They will also need to
make adjustments to their management strategy as more and more
of their employees have their jobs enriched.

OJE SUMMARY COMMENTS

When done correctly, OJE gives the worker a job to do that
provides him with the motivation to work because the job
provides a means to meet his human need for psychological growth.
The supervisor/leader is now able to return to doing supervisory
functions and get away from babysitting functions. He will be
managing workers who have responsible jobs. The leader can
differentiate between "things" that prevent dissatisfaction and
the "things" that promote satisfaction. Higher levels of
management reap the benefits of motivated performance from
lower level performers.

OJE is a powerful management strategy to be sure, but it
must be kept in perspective. It is not a panacea for
management ills; it is not easy to do, it is work; healthy work
possibilities must exist, i.e. you cannot motivate people to do
meaningless work; and it does not offer a cookbook solution to the
problem of motivation to work. Finally, it must be accepted
as a personal strategy by the leader or it is doomed to failure
from the outset.

OJE AND THE MARINE CORPS (A PROPOSAL)

Since M-H Theory and OJE should become a way of thinking
and a way of leading and not an externally imposed edict to
improve this or that index, it seems obvious that four events
must take place. First, our top level leadership, e.g. the
Commandant, DC/S for Manpower, CG of FMFLant/Pac and Division
or Wing commanders, must sense a need for it and overtly decide
to pursue the effort. We must "buy in" all the way or leave
it alone. Next, we must gain a sense of direction: what are
we after? Third, where do we establish a test project? And
finally, how do we institutionalize a program?

TOP LEVEL LEADERSHIP


Each of the leadership echelons mentioned above has been or
will soon be briefed on theory, concepts and methodology.
To date, each has shown a positive propensity to proceed with
a test project. Only with their executive management support
can upward changes be accepted and assets dedicated
to the effort. Additional briefings will be required if a
project is approved.

WHAT  ARE  WE  AFTER 

Combat readiness through well trained, highly motivated
Marines; professionalism is our goal. Emphasis must be on
people. Are we providing realistic, challenging training?
Are our people responsive to the training (productivity)? Can
we improve their perspective of themselves, their Corps, their
mission?

Traditionally we have looked at motivation/morale indices
like U/A and desertion rates, attrition/reenlistment rates,
racial incidents, enthusiasm/unit cohesion and, most important,
how the job is done in the field. These are still a reflection
of the unit and they should remain as interim indices. New
indices may be discovered as any project progresses. Regardless
of the index, we are looking for that elusive quality of
"attitudinal change" that improves combat readiness. At this
point, I do not feel we should look for monetary gains, which
may well be fallout or even a new index in a project. This
could be especially so when we work in technical fields.


Admittedly, some of these sought indices are "soft" and
difficult to measure. Again, my faith in the attitudinal-affective
results that are a mainstay of M-H Theory comes into play. Go
and ask the Marine; he will tell you what no questionnaire can
ever do: give you feelings, impressions expressed in a real
setting, and he can articulate changes he perceives over time. I
believe this "flavor" in personal perceptions is imperative to
the results we hope to achieve.

WHERE  TO  ESTABLISH  A  TEST  BED 

I believe we must build on the basic experience of those who
have already tried this approach: the Air Force Air Logistics
Command (ALC) effort, which has realized such success. Accordingly
I am recommending we establish two test projects: one technical
project with an aviation squadron and one project in an infantry
battalion, our cutting edge, a place most difficult to produce
high visibility results but a place where attitudinal changes
can mean so much in the execution of mission. Improvements
in this latter area will have a tremendous impact on our entire
leadership package, which must ultimately become the vehicle
to institutionalize the entire job satisfaction-OJE effort.

INSTITUTIONALIZATION

Once the effort at the project level is deemed a success,
we must then incorporate the methodology into our school system.
It may well provide improved boundaries and foundation for our
current leadership package, which is heavy in human relations
activities. This new concept places human relations into true
perspective: a necessary hygienic/maintenance factor (interpersonal
relations) within the total leadership sphere. Accordingly,
teaching M-H Theory and OJE as a leadership/managerial strategy
and process simply subsumes our current program with a more
comprehensive, substantive effort in an understandable framework.
Teaching M-H Theory at the SNCO Academy, The Basic School,
Amphibious Warfare School (intermediate level) and Command and
Staff College is necessary. The leadership is then established
for implementation on the jobs these leaders inherit. Finally,
people and work issues are kept contemporary via the current
20 hour leadership package each Marine participates in annually.
The institutionalization mechanism is in place now. Success
in projects undertaken should fuel the fire of rapid expansion.

Finally, what will be the modus operandi in project execution?
The following is offered in response to that question and in
summary of this paper:

1. Continue to educate down first to better facilitate
changes originating at grass roots level. Many more "briefs
of chiefs" must be conducted. A seminar of some 8-10 hours is
in order, covering theory, application, and executive level
expectations.

2. Identify the two project organizations. This is easy.
Volunteers abound. Several units have already attempted a
piecemeal effort and are overwhelmed with positive response.

3. Select key people for a 100-120 hour education process
under Doctor Herzberg's tutelage. These people must come from
the project units and be flexible and knowledgeable in unit
functions/activities; each must display a capacity for managerial growth.
Included in this group must also be a group of three or more
people from HQ Marine Corps Manpower who will subsequently aid
in the institutionalization process as well as provide guidance
for new projects if any are desired. Doctor Herzberg will act
only as a consultant on request by this time. The keyman is
just that, the heart of the program. His selection is paramount
to success.

4. Keymen can then conduct briefings for Executive Groups
(if any) and Coordinating Groups (usually 4-6 people).

INSERT  CHART  8  ABOUT  HERE 

These are the middle supervisors: the commanders and/or
staff who approve and support the efforts of the implementing
groups.

5. Implementing groups are where the work is done. These 6 to
8 people are the key supervisors--lieutenants, platoon sergeants
or squad leaders. These are the people who know and represent
the jobs to be enriched. Keymen train these people in detail.

This group then:

a. Blocks out the jobs and flow-charts them.

b. Identifies bad job characteristics.

c. Greenlights changes.

d. Redlights changes (good job characteristics installed).

e. Develops implementing plans for approval.

f. Works with the Coordinating Group on a milestone chart
for the implementing plan.

g. Determines at least interim indices for both
measurement and feedback.

6. Execute the plan:

a. Measure results.

b. Provide feedback.

c. Refine and modify indices.

By  utilizing  the  above  approaches  and  measures  we  should  be 
able  to  track  and  evaluate  the  success  and  progress  of  our 
OJE effort at all levels.

CHART 2
MOTIVATION-HYGIENE THEORY

(1) FACTORS ARE ARRANGED IN ORDER OF FREQUENCY OF OCCURRENCE
(2) MOTIVATORS ARE ALSO IN ORDER OF FREQUENCY AND INVERSE ORDER OF IMPORTANCE

CHART 7
ENRICHED JOB

CHART 8
ORGANIZATION FOR ORTHODOX JOB ENRICHMENT


OVERVIEW OF THE US ARMY JOB SATISFACTION
AND RETENTION PROJECT

MAJ DENNIS J. HUPP

JOB SATISFACTION AND RETENTION


MY PRESENTATION IS AN OVERVIEW OF THE ARMY'S JOB AND
CAREER SATISFACTION PROJECT. MY INTENT IS TO PROVIDE A
CONCEPTUAL AND FACTUAL FRAMEWORK SO YOU CAN BETTER
UNDERSTAND AND EVALUATE OUR PROJECT AND PRESENTATIONS THIS
MORNING.

I WILL PRESENT THE OVERVIEW IN THREE PARTS: THE
CONTEXT IN WHICH THE PROJECT BEGAN, THE INTENDED USES
OF THE DATA, AND THE COMPONENTS OF THE PROJECT.

I WOULD PREFER THAT THIS BE MORE OF AN INFORMAL
DISCUSSION THAN A HIGHLY FORMAL, LECTURE-LIKE
PRESENTATION, SO IF YOU HAVE ANY QUESTIONS OR COMMENTS, PLEASE
TOSS THEM OUT AS THEY ARISE.

CONTEXT

IN JUNE 1976, MILPERCEN BEGAN EXPANDING THE JOB
SATISFACTION PORTION OF THE OCCUPATIONAL SURVEY PROGRAM.
THE PURPOSE OF THE EXPANSION WAS TO BETTER UNDERSTAND
HOW SATISFACTION WITH ONE'S ARMY JOB AND MILITARY LIFE
AFFECTS THE DECISION TO STAY IN OR LEAVE THE SERVICE. WE
HAVE BEEN ESPECIALLY INTERESTED IN THE RELATIONSHIP
BETWEEN JOB AND CAREER SATISFACTION AND FIRST-TERM
REENLISTMENTS.

THE MILPERCEN PROJECT IS PART OF THE ARMY'S
INTENSIFIED EFFORT TO GAIN ADDITIONAL INSIGHTS INTO RETENTION,
JOB SATISFACTION, AND THE ALL-VOLUNTEER ARMY. THE
ULTIMATE OBJECTIVE IS TO IMPROVE THE ARMY'S ABILITY TO
RECRUIT AND RETAIN A SUFFICIENT NUMBER OF QUALITY SOLDIERS.

AS TUTTLE AND HAZEL HAVE POINTED OUT, MOST OF THE
RESEARCH AND APPLICATIONS OF JOB SATISFACTION HAVE OCCURRED
IN INDUSTRY (TUTTLE AND HAZEL, 1974). HOWEVER, WITHIN THE
LAST TEN YEARS THE MILITARY SERVICES HAVE BEGUN TO APPLY
THE RESEARCH FINDINGS FROM THE PRIVATE SECTOR AND TO
SPONSOR THEIR OWN RESEARCH IN THIS AREA. MUCH OF
MILPERCEN'S WORK BUILDS ON RELATED RESEARCH BY THE ARMY RESEARCH
INSTITUTE (ARI) AND THE AIR FORCE'S HUMAN RESOURCES
LABORATORY (AFHRL). THE JOB AND CAREER SATISFACTION
PROJECT MOST CLOSELY RESEMBLES THE AIR FORCE'S APPROACH
(ALLEY AND GOULD, 1975). DR. LARRY GOLDMAN WILL EXPLAIN
MORE ABOUT THIS IN HIS PRESENTATION.

ALTHOUGH OCCUPATIONAL ANALYSIS BEGAN SYSTEMATICALLY
ARMY-WIDE IN 1968, THE JOB SATISFACTION PORTION WAS NOT
ADDED TO THE OVERALL SURVEY PROGRAM UNTIL FALL 1974.

TWENTY ITEMS WERE USED TO OPERATIONALLY DEFINE AND
EMPIRICALLY MEASURE SATISFACTION WITH ONE'S ARMY JOB AND
WITH MILITARY LIFE. THE OPERATIONAL DEFINITIONS OF JOB
AND CAREER SATISFACTION WERE BASED ON THE HYGIENE AND
MOTIVATOR FACTORS THAT FREDERICK HERZBERG IDENTIFIED IN
HIS RESEARCH ON JOB SATISFACTION (HERZBERG, MAUSNER, AND
SNYDERMAN, 1959). THIS VUGRAPH (SEE FIGURE A BELOW)
LISTS HERZBERG'S MOTIVATOR AND HYGIENE FACTORS. THE
ASTERISKED VARIABLES ARE THOSE FOR WHICH SPECIFIC
OCCUPATIONAL SURVEY ITEMS HAD BEEN WRITTEN.


FIGURE A


HERZBERG'S MOTIVATORS

  ACHIEVEMENT
* RECOGNITION
  WORK ITSELF
* RESPONSIBILITY
* ADVANCEMENT
* GROWTH


HERZBERG'S HYGIENE FACTORS

  COMPANY POLICY & ADMINISTRATION
* SUPERVISION
* RELATIONSHIP WITH SUPERVISOR
* WORK CONDITIONS
* SALARY
* RELATIONSHIP WITH PEERS
* PERSONAL LIFE
  RELATIONSHIP WITH SUBORDINATES
* STATUS
  SECURITY

ALTHOUGH THE HERZBERG-BASED ITEMS WERE ADEQUATE FOR
OCCUPATIONAL ANALYSIS, WE DECIDED TO DISCARD THIS APPROACH
IN FAVOR OF ONE THAT WAS DESIGNED TO HELP PREDICT CAREER
DECISIONS, WORK ATTITUDES, AND DUTY PERFORMANCE (ALLEY AND
GOULD, 1975).

... SATISFACTION DATA WERE TO IDENTIFY THE RELATIVE DEGREE OF
SATISFACTION/DISSATISFACTION AMONG DIFFERENT OCCUPATIONAL
GROUPS AND TO AMPLIFY ON RELATED DATA COLLECTED IN OTHER
PARTS OF THE OCCUPATIONAL SURVEY QUESTIONNAIRE. THE
UNDERLYING PURPOSE OF THESE EFFORTS WAS TO PROVIDE
OCCUPATIONAL INFORMATION TO THE ARMY'S TRAINERS AND
OCCUPATIONAL STRUCTURES PEOPLE.

IN  REASSESSING  THE  INTENDED  USES  OF  THE  JOB  AND  CAREER 
SATISFACTION  PROJECT,  TWO  CRITERIA  WERE  FOLLOWED: 

- EXPANDED USES WOULD BE BASED ON THE CURRENT AND
FUTURE NEEDS OF KEY ARMY DECISION-MAKING AGENCIES.

- THE PROJECT WOULD BE LINKED TO OTHER RELATED
RESEARCH AND STUDY EFFORTS IN THE ARMY, OTHER SERVICES, AND
INDUSTRY.

CONSEQUENTLY, CONSIDERABLE TIME AND EFFORT WERE DEVOTED TO
DETERMINING THE NATURE AND SCOPE OF OTHER ON-GOING WORK
AND IDENTIFYING THE NEEDS OF KEY AGENCIES.

THE OUTCOME OF THIS REASSESSMENT WAS AN EXPANDED LIST
OF INTENDED USES. THIS VUGRAPH (SEE FIGURE B BELOW) SHOWS
THE EXPANDED USES.

FIGURE B

EXAMINE  RELATIONSHIP  BETWEEN  JOB  SATISFACTION  AND: 

-  RETENTION 

-  MORALE 

-  OCCUPATIONAL  MISMATCH 

-  EFFECTIVE  USE  OF  TRAINED  ASSETS 

- SELECTED STUDIES (E.G., WOMEN IN THE ARMY)

JOB AND CAREER SATISFACTION SURVEYS

OUR PROJECT CONSISTS OF FOUR ELEMENTS: THREE
ARMY-WIDE ATTITUDE SURVEYS AND ONE OCCUPATIONAL SURVEY.
I SHALL EXPLAIN EACH OF THESE BRIEFLY.

IN AUGUST 76 A SAMPLE SURVEY FOCUSED ON FIRST-TERM
SOLDIERS IN GRADES E-3 AND E-4. APPROXIMATELY 3,700
FIRST-TERMERS RESPONDED TO 38 QUESTIONS ABOUT JOB
SATISFACTION AND REENLISTMENT INTENT. THE RESULTS WERE
PUBLISHED AS A SURVEY REPORT IN MAY 1977. THE REPORT IS
ENTITLED "JOB SATISFACTION AND REENLISTMENT INTENT FOR
FIRST-TERM PERSONNEL: INITIAL FINDINGS."

IN FEBRUARY OF THIS YEAR WE RANDOMLY SAMPLED
APPROXIMATELY 9,000 FIRST-TERM AND CAREER ENLISTED MEN AND WOMEN.
THE SURVEY CONSISTED OF 30 ITEMS ABOUT JOB SATISFACTION
AND CAREER DECISIONS. THE FEBRUARY SURVEY WAS AN
ABBREVIATED VERSION OF THE AIR FORCE'S OCCUPATIONAL ATTITUDE
INVENTORY. IT WAS MODIFIED TO ACCOUNT FOR THE DIFFERENCES
BETWEEN THE ARMY AND AIR FORCE. THE RESULTS OF THIS
SURVEY WILL BE PUBLISHED NEXT MONTH. MR. DARRELL WORSTINE
IN HIS PRESENTATION WILL HIGHLIGHT SOME OF THE MAJOR
FINDINGS OF THE FEBRUARY 77 SURVEY.

IN MARCH OF THIS YEAR WE SENT OCCUPATIONAL SURVEY
QUESTIONNAIRES TO 1,725 RECRUITERS AND CAREER COUNSELORS
(MOS 00E). THIS SURVEY WILL PROVIDE INSIGHT INTO THEIR
PERCEPTIONS ABOUT THE CAREER MOTIVATION OF ENLISTED
SOLDIERS. THE RESULTS OF THE SURVEY WILL BE PUBLISHED
EARLY NEXT YEAR.

AT THE END OF THIS MONTH, WE WILL RANDOMLY SURVEY
APPROXIMATELY 30,300 FIRST-TERM AND CAREER ENLISTED MEN
AND WOMEN. THESE SOLDIERS WILL BE ADMINISTERED A
COMPREHENSIVE JOB AND CAREER SATISFACTION QUESTIONNAIRE
(APPROXIMATELY 300 ITEMS). THE QUESTIONNAIRE REPRESENTS
THE CULMINATION OF MORE THAN ONE YEAR OF DEVELOPMENTAL
WORK. THE RESULTS OF THE OCTOBER 1977 SURVEY WILL BE
PUBLISHED BY JUNE OR JULY 1978.


THE AUGUST 1976 ARMY-WIDE SURVEY AND THE APRIL 1977 PILOT TEST

DR. LAWRENCE A. GOLDMAN


I The August 1976 Army-Wide Survey
A. Introduction

In August 1976, an 80 item Army Quarterly Sample Survey was
distributed to a random sample of personnel Army-wide. Since items
included in this survey were finalized prior to initiation of the job
satisfaction/reenlistment intent project, this questionnaire was not
designed to be a comprehensive instrument for measuring specific
factors influencing soldiers' attitudes towards these two criterion
measures. In particular, coverage of factors with the potential of
influencing reenlistment was incomplete. Job satisfaction was addressed
primarily by 17 independent factors developed by the Military
Occupational Development Division, for which data have been collected
systematically from enlisted personnel through Military Occupational
Data Bank questionnaires since September 1974. To extend the domain
of measurement into other areas believed to influence job satisfaction
and possibly reenlistment, an additional 21 items included in this
Quarterly Sample Survey (not by personnel affiliated with this project)
were also considered in the analysis. The major defects in the
coverage of reenlistment related factors in the August 1976
questionnaire have been reduced to a considerable extent in the February 1977
Army-wide survey to be discussed by Mr. Worstine. As Major Hupp
indicated in his introductory remarks, analysis of the August 1976 survey
was conducted on approximately 3,700 personnel in paygrades E-3 and
E-4 who were in their initial term of enlistment.

B. Significant Findings and Implications

1. Interesting work was identified as the best predictor of both
reenlistment intent and job satisfaction, at least for first term
personnel. This finding was noted for E-3's as well as E-4's, males
and females, non-high school graduates and high school graduates, Whites
and Blacks, and single and married personnel. From this it could be
inferred that the extent to which soldiers perceive their work to be
interesting strongly influences overall satisfaction with their job and
their intention to reenlist. This is indeed significant since it
implies that a nonmonetary factor plays a greater role in job
satisfaction and reenlistment intent than military pay, allowances
and/or benefits, a belief commonly shared by many civilians as well as
individuals within the Army. With the concomitant responsibility of
the Army to reduce overall personnel-related costs while increasing
the retention rate of eligible personnel, especially under the All
Volunteer Force, attempting to make jobs more appealing may alleviate


2. Work importance, work challenge, and working association with
one's supervisors were relatively consistent predictors of job
satisfaction in terms of grade, sex, educational level, race, and
marital status. In other words, a soldier perceives a job to be
satisfying if he/she believes that it is truly substantial while
essentially free from "harassment" by his/her supervisor.

3. Soldiers who believed they were given accurate information
by an Army recruiter had higher reenlistment intentions and greater
job satisfaction than those who didn't. Soldiers who feel they were
given accurate information by an Army recruiter probably enter the
Army with a much clearer idea of what to expect from Army life. This
does not imply that Army recruiters either truly represented or
misrepresented the facts about Army life. What this represents is the
extent to which the expectation of the individual corresponded to the
information imparted to him/her by the Army recruiter. This, in turn,
may be a primary reason why he/she has a significantly higher
propensity to reenlist and greater job satisfaction.

II  The  April  1977  Pilot  Test 
A. Introduction

In the early planning phase of the overall project, several weeks were
spent discussing the nature and scope of an Army job and career
satisfaction model. The original intent was to develop a model and
test it. However, because of time and manpower constraints, it was
decided to capitalize on the extensive literature review and long-range
research conducted by the US Air Force Human Resources Laboratory
concerning job/career satisfaction. Not only did the Air Force's
efforts parallel MILPERCEN's, but their research program was closely
linked to and based upon an occupational analysis program, and one of
the main purposes of their research was to study the relationship
between job satisfaction and retention (reenlistment). Since the
Air Force concluded there were no adequate job satisfaction measurement
instruments for use in the military environment, an Occupational
Attitude Inventory (OAI) was developed. In the initial development of
the OAI, 36 potential satisfaction dimensions (also referred to as
hypothesized factors) were identified by Air Force behavioral scientists
familiar with the military work environment. Items were written for
each dimension, resulting in a final pool of 348 items, or approximately
10 items per dimension. To validate these hypothesized factors, a
random sample of 3,000 first term airmen was analyzed.

In the initial development of an Army pilot test questionnaire, all of
the Air Force's hypothesized factors were used except "Personal Growth
and Development", "Independence" and "Unclassified". In addition to
adopting the basic Air Force dimensions, four factors were hypothesized
for the pilot test: "Family", "Individual", "Discrimination", and "Army
Unique". These four additional factors, based on the adult development
research of Daniel Levinson and Roger Gould, are believed to be
important influences on a person's motivation and behavior. The
factors hypothesized by the Army and the number of items associated
with each factor are shown in the Table.


Insert the Table about here


TABLE

AIR FORCE                                         ARMY
FACTOR DESCRIPTOR                           N     FACTOR DESCRIPTOR                     N

Achievement                                 7     Achievement                           2
Activity                                    8     Activity                              4
Air Force and Unit Policies and Practices   18    Army and Unit Policies and Practices  17
Assignment Locality                         17    Assignment Locality                   16
Authority                                   4     Authority                             3
Co-workers                                  9     Co-workers                            12
Creativity                                  10    Creativity                            5
Importance                                  8     Importance                            2
Interest                                    9     Interest                              4
Knowledge of Results                        7     Knowledge of Results                  3
Personal Growth and Development             9     -
Job Design                                  10    Job Design                            3
Optional Social Contact /
Required Social Contact                     ?     Social Contact                        11
Pay and Benefits                            12    Pay and Benefits                      8
Physical Work Environment                   13    Physical Work Environment             9
Promotion Opportunity                       8     Promotion Opportunity                 4
Recognition                                 9     Recognition                           4
Responsibility                              10    Responsibility                        4
Independence                                9     -
Value of Experience                         8     Value of Military Experience          ?
Physical Safety                             6     Physical Safety                       3
Economic Security                           4     Economic Security                     2
Service to Others                           8     Service to Others                     1
Social Status                               11    Social Status                         5
Sufficiency of Training                     12    Sufficiency of Training               10
Supervision Received - Human Relations      15    Human Supervision                     15
Supervision Received - Technical            9     Technical Supervision                 5
Performance Evaluation                      8     Performance Evaluation                3
Job Change                                  7     Job Change                            4
Tools, Equipment and Supplies               8     Tools, Equipment, and Supplies        7
Utilization                                 8     Utilization                           3
Variety                                     9     Variety                               4
Work Schedule                               15    Work Schedule                         6
Supervisory Duties                          18    Supervisory Duties                    10
Unclassified                                8     -
-                                                 Individual                            14
-                                                 Army Unique                           10
-                                                 Discrimination                        9
-                                                 Family                                18

TOTAL                                       348   TOTAL                                 244

In general, those items were selected for inclusion in the pilot test
questionnaire which loaded highest under each of the Air Force's 35
empirically derived factors using principal components analysis. Also,
most of the original items utilized by the Air Force were modified to
reflect the differences in Army and Air Force environments and
terminology. Efforts were also made to enhance item clarity.

Of the original 348 items used by the Air Force, 208 items were
retained, either in their original wording or in modified form. A
total of 26 items were added to the hypothesized Air Force factors
while another 90 were included for the four new factors used in the
pilot test questionnaire. The combined total of 324 items was then
reduced to 225 based on the following criteria: (a) redundancy of
items; (b) reducing the number of items in each of the hypothesized
factors "Individual", "Human Supervision", and "Family"; and (c)
constraints imposed by the length of the answer sheet. All items
were measured on a seven point scale, ranging from extremely dissatisfied
to extremely satisfied.
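
The item bookkeeping in the paragraph above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are those reported in the text, while the variable names are assumptions introduced here and do not come from the project.

```python
# Item counts as reported in the text; the names are illustrative only.
AIR_FORCE_POOL = 348          # original Air Force item pool
RETAINED_FROM_AF = 208        # retained in original or modified wording
ADDED_TO_AF_FACTORS = 26      # new items written for the Air Force factors
ADDED_FOR_NEW_FACTORS = 90    # items written for the four new Army factors
PILOT_TEST_ITEMS = 225        # items on the pilot test questionnaire

# Combined pool before the final reduction.
combined_pool = RETAINED_FROM_AF + ADDED_TO_AF_FACTORS + ADDED_FOR_NEW_FACTORS

# Air Force items that were not carried over.
dropped_from_af_pool = AIR_FORCE_POOL - RETAINED_FROM_AF

# Items removed under criteria (a) through (c).
removed_before_pilot = combined_pool - PILOT_TEST_ITEMS

print(combined_pool, dropped_from_af_pool, removed_before_pilot)
```

The combined pool works out to the 324 items stated in the text, so the reduction to 225 removed 99 items.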

The pilot test was administered by three teams of two individuals each
from MILPERCEN to an availability sample of approximately 1,600 personnel
at six US installations. In addition, roughly 800 individuals were
personally interviewed. These interviews provided insights into the
content validity of the questionnaire and allowed soldiers the
opportunity to comment on the wording, length, sensitivity and coverage.

satisfaction variables. Since the soldiers responding to the pilot
were asked to respond to 225 items (74 items in each of three sections
and three in another section), it was hypothesized that the mean value
of each of these variables might vary according to where it appeared
in the questionnaire. To test this hypothesis, these three sections
were counterbalanced (i.e., the order of the three sections was varied
at the six installations). One-way analysis of variance of the six
mean values obtained for each variable indicated statistically
significant differences for only 12, or 5.4 percent, of the 222 separate
tests. Thus, there were no substantial differences in terms of the
effect of the order of presentation.
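
The 5.4 percent figure can be compared with the number of significant results that chance alone would be expected to produce. The short sketch below does that arithmetic. The counts 12 and 222 come from the paragraph above; the .05 significance level is an assumption, since the text does not state the level used.

```python
# Counts reported in the text.
n_tests = 222        # 74 items in each of the 3 counterbalanced sections
n_significant = 12   # ANOVAs showing a significant order effect

# Assumption (not stated in the source): each test was run at alpha = .05.
alpha = 0.05

observed_rate = n_significant / n_tests
expected_by_chance = alpha * n_tests

print(f"observed rate of significant tests: {observed_rate:.1%}")
print(f"significant tests expected by chance: {expected_by_chance:.1f}")
```

Under that assumed level, the observed count is close to the chance expectation, which is consistent with the conclusion that order of presentation had no substantial effect.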


Wide Survey. The 225 items were reduced to a final total of 124 through
application of the following three procedures: (a) factor analysis; (b)
stepwise multiple regression analysis and discriminant function analysis;
and (c) subjective review, primarily based on the previously obtained
results.

The 225 variables were factor analyzed with a primary objective
of identifying those variables whose loadings were "significantly high"
(i.e., those whose correlations with any of the rotated factors were .40
or higher). The 225 items considered as independent variables were then
examined through stepwise multiple regression and discriminant function
analysis. For both these analyses, the dependent variables used to
gain insight into the multi-faceted aspects of reenlistment behavior and
satisfaction comprised the following: (a) reenlistment plans; (b) job
satisfaction; (c) Army satisfaction; (d) description of unit morale;
and (e) description of one's job. A total of 121 items loaded
significantly on at least one factor and were significant predictors
and/or discriminators of at least one of the five criterion measures
utilized. Fifteen items were added so there would be at least one item
for each of the 55 hypothesized factors. The 136 items were then
subjectively reviewed to eliminate duplication within the same hypothesized
factor. Also, several items were eliminated which were judged to be of
little practical value in terms of job/Army career satisfaction, morale,
and retention (e.g., "Your opinion of the Army compared to the Air Force").
Starting from the 136 items based on objective analysis, subsequent and
subjective review reduced this number to the 124 items constituting
Section B of the comprehensive Army-wide Job and Career Satisfaction


Implementation  of  a  Model  Adaptive  Testing  System  at  an 
Armed Forces Entrance and Examination Station


Malcolm  James  Ree 
Personnel  Research  Division 
Air  Force  Human  Resources  Laboratory 
Brooks  Air  Force  Base,  Texas 

In a world of increasing technical complexity and diminishing resources,
it is the task of the military recruiting agencies to obtain the most highly
qualified candidates for technical training. Traditionally, paper-and-pencil
multiple aptitude test batteries have been administered to applicants of a
wide range of abilities. These tests have been peaked to be most
discriminating over a relatively narrow range because limited time precluded
the administration of enough items to gain maximal test information over a
broad range of ability. However, selection and classification decisions
must be made which require discriminations at the 80th percentile. At this
level, only limited information is available from a peaked test.

Adaptive testing, particularly computer-driven adaptive testing,
promises to enable the gathering of test information (Lord and Novick, 1968,
eq 20.2.7) at all levels of ability with equal precision and to increase
the predictive validity of our military accession testing. Furthermore,
adaptive testing promises to reduce the time required to gain ability
estimates for applicants and thus possibly reduce overall costs by making
accession a one day process.

The model adaptive testing system was implemented in an Armed Forces
Entrance and Examination Station (AFEES) in order to study its feasibility
for use in a military selection setting. At the AFEES, the testing
system must be operated by individuals without any special training in
computer hardware or software. The system must perform when needed; it
must be operational for the entire workday, and also accommodate applicants
for military service from very low ability to very high ability. It must
not intimidate or frighten the applicants or the test administrators.
Finally, it must provide valid and reliable measurement.

Prior to the implementation of an adaptive testing system, many
decisions must be made, both technical and administrative. The technical
questions include: who are the subjects, what ability areas are to be
tested, what items and item statistics are available, which scoring method
will be used, which item selection technique will be used, what media
for question presentation will be used, and how pictorial items will be
presented.

There  are  also  many  administrative  questions.  How  can  the  operation 
be  simplified  so  that  low  ability,  careless,  or  inattentive  examinees 
do  not  cause  an  abnormal  ending  of  the  program?  What  impact  will  the 
demonstration have on day to day AFEES operations?

The San Antonio, Texas, AFEES was chosen as the test site because it
was close to the development center at the Air Force Human Resources
Laboratory (AFHRL), which afforded considerable opportunity for monitoring
the progress of the adaptive testing system.

The subjects for this demonstration were applicants for military
enlistment, and their abilities covered a very broad range of aptitudes. They
were tested in three aptitude areas which comprise the Armed Forces
Qualification Test (AFQT): Word Knowledge (WK), Arithmetic Reasoning (AR),
and Space Perception (SP). The AFQT is used for initial qualification for
military service. Other aptitude areas are usually measured only if an
acceptable score on the AFQT is achieved. The subjects were tested while
awaiting the results of the AFQT.

The items used for this model adaptive testing system were culled
from existing historic item files at the AFHRL. Only item difficulty (P)
and item discrimination (Phi) indices were available. Items were selected
to represent a generally rectangular distribution of difficulties from
about .2 to about .8 with the highest available discrimination index at
each difficulty level. These items were then assembled into booklets for
administration to Air Force basic recruits in order to estimate latent
trait parameters a, b, and c (Lord & Novick, 1968) for later phases of
this demonstration. Initially, the classical item indices were transformed
via approximations and used to calculate the latent trait parameters useful
for the project. Although these estimates would vary somewhat from the
more exact estimates obtained from the new response data, they did permit
a reasonable starting point from which to demonstrate the feasibility of
adaptive testing for applicants for military service. As soon as a
satisfactory sample has been collected and the parameters estimated, the
approximated parameters will be replaced by the new, more exact parameters.
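
The text does not say which approximations were used. One widely taught choice, which holds under a normal-ogive model with normally distributed ability and no guessing, converts a proportion correct p and an item-ability biserial correlation r into a = r / sqrt(1 - r^2) and b = z / r, where z is the standard normal deviate exceeded by a proportion p of examinees. The sketch below applies those two formulas. Treating the available discrimination index as a biserial correlation is an assumption made only for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_irt_parameters(p_correct, r_biserial):
    """Return (a, b) from classical indices under a normal-ogive approximation.

    Assumes normally distributed ability and no guessing (c = 0).
    """
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must lie strictly between 0 and 1")
    if not 0.0 < r_biserial < 1.0:
        raise ValueError("r_biserial must lie strictly between 0 and 1")
    # Discrimination grows without bound as the correlation approaches 1.
    a = r_biserial / sqrt(1.0 - r_biserial ** 2)
    # Normal deviate exceeded by a proportion p_correct of examinees.
    z = NormalDist().inv_cdf(1.0 - p_correct)
    b = z / r_biserial
    return a, b


a_mid, b_mid = approximate_irt_parameters(0.5, 0.6)    # item of middling difficulty
a_easy, b_easy = approximate_irt_parameters(0.8, 0.6)  # easy item, so b is negative
```

Because the approximation ignores guessing, values obtained this way would be expected to differ from parameters estimated from fresh response data, which matches the caution in the paragraph above.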

A medium size computer, IBM 360/65, was available for the
demonstration in a time-shared mode. The APL programming language (Gilman & Rose,
1970) was selected because it is interactive and has extremely powerful
operators. In addition, experience has shown that APL leads to fast
development. It is also fast in execution and is particularly suited
for handling vectors and matrices.

A combination of Bayesian item scoring/ability estimation (Owen, 1969)
and selection of items by maximum information (Lord & Novick, 1968, eq 20.4.1)
was selected for ease of programming and low computer core utilization.
There are two criteria for terminating item administration: reduction of
the posterior variance of the ability estimate to a low value (< .0625)
and/or the subject's having taken 20 items. This procedure is also advantageous
because it does not require a structured item pool, as would a stratified
adaptive test, thus making implementation of the testing system easier.
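
The selection and stopping rules described above can be sketched as follows. This is a minimal illustration, not the APL program used in the demonstration: it uses the usual textbook form of the three-parameter logistic item information function (whether that matches the cited equation exactly is an assumption), takes the variance criterion as a ceiling of .0625, and uses a hypothetical three-item pool. Owen's Bayesian ability update is not reproduced here.

```python
from math import exp

D = 1.7                    # conventional logistic scaling constant (assumption)
MAX_ITEMS = 20             # item limit stated in the text
VARIANCE_TARGET = 0.0625   # posterior variance criterion (a posterior SD of .25)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct answer."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def next_item(theta, pool, used):
    """Index of the unused item with maximum information at theta."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))


def should_stop(posterior_variance, n_administered):
    """Stop when the estimate is precise enough or the item limit is reached."""
    return posterior_variance < VARIANCE_TARGET or n_administered >= MAX_ITEMS


# Hypothetical pool of (a, b, c) triples, for illustration only.
pool = [(1.0, -1.0, 0.2), (1.0, 0.0, 0.2), (1.0, 1.0, 0.2)]
first = next_item(0.0, pool, used=set())
```

Because every unused item is a candidate at each step, no pre-arranged strata or branching structure is needed in the pool, which is the advantage the paragraph above points out.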

A modified Tektronix model 4006-1 Cathode Ray Tube (CRT) terminal was
used for the demonstration. A viewing hood to reduce glare and a keyboard
cover to prohibit pushing inappropriate keys were fabricated. This
terminal supported the Tektronix Graphics Package, APLgraph 2, and was run
at 1200 BAUD in half duplex mode.

In order to insure proper operation of the system, operating
instructions and operating safeguards were built in. The examinee is taught
how to use and respond to the terminal before any questions are presented.
All solicitations for input are for characters ('1,' '2,' '3,' '4,' '5'),
as opposed to numbers, and are checked to determine the presence of alphabetic
(ABCD, etc.) or special characters ($, &, etc.). If an out-of-range response,
an alphabetic character, or a special character is given, the instructions
for responding are repeated. Then the screen is cleared and finally the
question is repeated. Proper character input is then converted to its
equivalent numerical form and processed.
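
The input safeguard described above can be sketched in a few lines. This is an illustrative Python rendering under stated assumptions, not the original APL: the function names are hypothetical, and the screen-clearing step is omitted because it depends on the terminal.

```python
VALID_RESPONSES = ("1", "2", "3", "4", "5")


def parse_response(raw):
    """Return the answer as an integer 1-5, or None if it must be asked again.

    Input is read as characters, so letters and special characters are
    rejected here instead of causing a numeric conversion error.
    """
    text = raw.strip()
    if text in VALID_RESPONSES:
        return int(text)
    return None


def ask(question, instructions, read=input, show=print):
    """Present a question until a valid response is given."""
    while True:
        show(question)
        answer = parse_response(read())
        if answer is not None:
            return answer
        # Invalid input: repeat the instructions, then loop to repeat the question.
        show(instructions)
```

Passing `read` and `show` as parameters keeps the loop testable without a terminal; in interactive use the defaults `input` and `print` apply.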

As expected, the characters for the questions of WK and AR are kept
on an external randomly accessible file and read in as needed. Screen
control characters are stored with the literal characters which makes for
simplicity of operation. The last array of the file, roughly equivalent
to the last record of a FORTRAN file, contains a 4 by N (N = the number of
items in the file) matrix of the item parameters and answer keys. This
matrix is read in and manipulated prior to the presentation of all questions.

Producing the pictorial displays for the SP items presented a unique
problem in storage and drawing. One proposed storage method was to use back
screen projection from a random access slide projector. This was discounted
because it allowed only about 100 items to be stored, added a mechanical
component to maintain, and required photographic slides of each item.
Similarly, the idea of writing a specific mathematical function to generate
each individual figure was dismissed because it required extensive
programming for each new item. Finally, a method of display was developed
using the sophisticated graphics capability of APL which only requires
placing a drawing of the SP item on a "digitizing tablet" and touching
various points on the drawing. These are then transformed into a vector
for each item, and the graphics package draws the many lines represented
by the vector at a very rapid rate (see Figure 1). At 1200 BAUD, the
figures almost flash onto the screen as the vector is read in from its place
on the file. This technique could be extended to any item requiring
drawings, such as mechanical principles or block counting.


All technology and software developed to draw the Space Perception
items are general enough to be used in other testing or educational
applications. The perspective and three dimensional effect are very
good, and motion for rotating or shifting the figures can be added.

Rotating figures, demonstration of mechanical principle, or moving lever
arms may lead to new item types not amenable to static paper-and-pencil
tests. Computer driven graphics may allow us to measure new and
important ability areas.

An operator's manual, algorithmically written, was produced for the
AFEES personnel. It contains complete instructions for initial daily
starting and stopping of the testing system. It also gives instructions
for starting the program if the terminal is already running. The manual
offers names and telephone numbers of people to contact in the event of
trouble. The programs have been "locked" to the AFEES staff, and they
have been advised not to try to edit the programs. Back-up copies of
both the programs and the files are stored on line and require only a
command from the proper user to reinstate damaged programs or to update
programs as they are refined.

Data grade telephone lines, a special telephone number for the AFEES
use only, and a special sign-on code were provided to reduce competition
for telephone ports in the time-shared environment. The "Special Testing
Room" at the AFEES was used to house the terminal. This is a 10' x 12'
windowless room containing several student chairs with arms, one side
chair, and a 3' x 2' table for the terminal. The terminal and the telephone
connector need little space and can be operated in any room with 117 volts
AC and a telephone.




The feasibility of adaptive testing will be investigated in this
demonstration by assessing two important factors. First, did the system
run with little trouble and attention? This will be assessed from
interviews with the AFEES staff and from daily logs of the system's operation.
Secondly, was adaptive testing as valid as paper-and-pencil testing?
The validity of the adaptive testing system will be assessed by comparing
the subjects' adaptive scores and the subjects' AFQT subtest scores.
Analysis of these data will help in making future decisions about
adaptive testing.

Following  this  demonstration  there  will  be  questions  to  answer  before 
any  large  scale  implementation  can  be  undertaken.  Some  of  these  questions 
are  psychometric,  some  logistic,  and  some  economic.  As  yet,  no  testing 
configuration,  local  or  nationwide,  has  been  developed,  nor  have  system 
costs  for  implementing,  operating,  and  supporting  adaptive  testing  been 
established.  Basic  conceptual  questions  dealing  with  such  diverse  topics 
as testing models, back-up systems, operating policies, and central versus
dispersed  processing  remain  unanswered. 

It is conceivable that certain other decisions will facilitate broad
scale implementation of adaptive testing. For example, the AFEES in Baltimore,
Maryland, already has computer-automated management and paper handling on
an in-house mini-computer. The addition of adaptive testing might require
little additional hardware, and, in quantity, this additional hardware might
be inexpensive enough to merit its use. Furthermore, adaptive testing could
add to test security because neither test booklets nor answer keys are

In the future, the actual costs and benefits of adaptive testing will
be known. This will permit realistic decision making for its use. This
knowledge will allow adaptive testing to move from the fad of the 1970's
to the operational tool of the 1980's and beyond.


AN ADAPTIVE TEST OF ARITHMETIC REASONING
James  R.  McBride 


In recent years there has been growing interest among test
theoreticians and practitioners in adaptive, or tailored, ability testing as an
alternative to group-administered conventional tests. The reasons for
this interest have been many, but a key reason is the psychometric
efficiency of tailored tests: by tailoring the choice of test items to the
individual test-taker, a short, well-designed adaptive test can match the
measurement precision of a much longer conventional test. In theory, it
is possible for an adaptive test to equal a conventional test's reliability
(and validity) in less than half its length.

The theoretical advantages of adaptive tests are not realized without
some cost, however. These costs can be expressed in item quality and
quantity. Urry (1970) demonstrated that the reliability/validity advantages
of adaptive tests depended on the availability of unusually highly
discriminating test items -- items with discriminating power equivalent to
item-trait biserial correlations exceeding .62. Lord's (1970) theoretical
analyses of adaptive tests focused on branching procedures such as the
stair-step (Lord, 1974) or pyramidal (Larkin & Weiss, 1975) procedure, and
the Robbins-Monro procedure. The stair-step method required an item pool
containing k(k+1)/2 items in order to administer an individualized k-item
test to each examinee; thus a 15-item pyramidal test required a 120-item
pool of highly discriminating test items. The superior Robbins-Monro
tailoring procedure required 2^k - 1 items in the pool for an individual test
length of k items; a 15-item Robbins-Monro test would require 32767 items
in the pool!
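
The two pool-size formulas quoted above are easy to verify for the 15-item case. The function names in this short sketch are illustrative only.

```python
def pyramidal_pool_size(k):
    """Items needed for a k-item stair-step (pyramidal) test: k(k+1)/2."""
    return k * (k + 1) // 2


def robbins_monro_pool_size(k):
    """Items needed for a k-item Robbins-Monro test: 2^k - 1."""
    return 2 ** k - 1


print(pyramidal_pool_size(15))       # 120
print(robbins_monro_pool_size(15))   # 32767
```

The pyramidal requirement grows quadratically with test length, while the Robbins-Monro requirement roughly doubles with every added item.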

Other  adaptive  strategies  made  less  exorbitant  but  still  stringent  item 
pool  demands.  Jensema  (1977)  recommended  an  item  pool  size  exceeding  100 
items  in  order  to  implement  Owen's  Bayesian  adaptive  strategy.  Urry  (1974) 
screened  about  900  operational  test  items  in  order  to  assemble  a  200-item 
pool  to  measure  verbal  ability.  In  short,  the  number  and  quality  of  items 
seemingly  required  to  implement  an  adaptive  testing  strategy  raised  serious 
questions  about  the  feasibility  of  using  adaptive  tests  in  settings  where 
item resources are limited. The purpose of this paper is to describe an
attempt to construct a useful adaptive test from a limited number of
available test items. The attempt was a theoretical one, but as you will
see below it was motivated by a practical problem, and was based on analysis
of the psychometric properties of real test items. You will also see below
that the attempt was successful, a fact which should have important
implications for future practice.




BACKGROUND 


The mental testing portion of the military enlistment screening
process consumes about three hours of each examinee's processing time. That
time is used to administer the Armed Services Vocational Aptitude Battery
(ASVAB), which consists of twelve cognitive subtests and an interest
inventory. If the time required to administer the cognitive tests could be
reduced substantially, some of the available three hours could be used
profitably in other ways: to collect biographical data, for example, or
to assess reading skills. However, reduction of that testing time would
require either eliminating some subtests, or shortening some or all of them;
an alternative solution, if it were feasible, might be to use adaptive tests
of the cognitive abilities. Part of the feasibility question hinges on the
availability of sufficient numbers of highly discriminating test items having
a wide distribution of difficulty. The purpose of the analyses reported
below was to assess the feasibility of constructing short adaptive subtests of
ASVAB, using available items, without detriment to the psychometric quality
of the test scores.

The Arithmetic Reasoning (AR) subtest was taken as a case in point. AR
is a 20-item subtest in the currently operational ASVAB Forms 6 and 7, is
highly reliable for its length, and considered singly is one of the best
subtests in terms of validity with external criteria. A target objective
of the feasibility study was to determine whether an adaptive version of AR
could be devised which would use available test items, have psychometric
properties equal to an operational AR test, yet be only half as long. In
order to specify the target objective rigorously, an information analysis
(Birnbaum, 1968) of the AR subtest of ASVAB Form 6 was performed, using
approximations of the parameters of each of the AR subtest's 20 items'
characteristic curves. The resulting test information function is
illustrated in Figure 1.

For those of you who are unfamiliar with this kind of analysis, let me
refer you to Birnbaum (1968) for a detailed presentation, but state briefly
here that the test information function is an index of the test's
measurement precision as a function of location on the ability scale. The higher
the local value of the test information function, the more useful is the
test for discriminating among examinees in that region of the scale. The
test information function is related to the test reliability (Samejima, 1977);
unlike the reliability coefficient, however, the information function is
invariant from group to group. One important property of it is its
relationship to the conditional variance of errors of measurement. Asymptotically,
the conditional error variance equals the inverse of the information function;
hence (again, asymptotically) the conditional standard error of measurement
equals the inverse of the square root of the information function. Figure 2
is a graph of the inverse of the square root of the values graphed in Figure
1, and may be interpreted as illustrating the measurement error characteristics
of the 20-item operational AR test. It and Figure 1 are standards against
which any substitute for the 20-item test may be judged.
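
The relation between the two figures is a one-line transformation, sketched below. The information values in the example are hypothetical and are not read from Figure 1.

```python
from math import sqrt


def conditional_sem(information):
    """Asymptotic conditional standard error: 1 / sqrt(information)."""
    if information <= 0:
        raise ValueError("information must be positive")
    return 1.0 / sqrt(information)


# Hypothetical information values at three points on the ability scale.
example_information = [4.0, 16.0, 25.0]
example_sem = [conditional_sem(i) for i in example_information]
```

Quadrupling the information halves the standard error, so regions where the information function is low show up as sharply higher error in a plot like Figure 2.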


THE  ADAPTIVE  TEST 


Constructing  an  adaptive  AR  test  required  identifying  a  source  of  test 
items  to  stock  the  item  pool  and  choosing  a  rationale  for  adaptive  item 
selection.  The  obsolete  Forms  2  and  3  of  ASVAB  were  chosen  as  the  item 
source,  resulting  in  a  pool  of  50  AR  items;  this  is  a  small  item  pool 
relative to the sizes usually suggested for adaptive testing. The item
characteristic curve parameters of each item were approximated from
available item analysis data.

The adaptive item selection rationale, or "strategy" (Weiss, 1974)
chosen was a two-stage variant of a multi-level test strategy similar to
that described by Lord (1977). This was motivated in part by the small
size of the item pool, and in part by the desire that the test be
administrable in paper-and-pencil form as well as by computer.

The first-stage, or "routing" test, is a three-item branching test
based on a seven-item subset of the 50-item pool. The remaining 43 items
were used to construct several seven-item overlapping levels of a
multi-level test. Each examinee answers questions at just one level; the choice
of level is based on his performance on the routing test.


The  Routing  Test 

Figure 3 is a schematic diagram of the routing test. Every examinee
answers  item  1,  which  was  chosen  as  the  item  which  is  locally  optimal  by 
the  least-squares  criterion  proposed  by  Owen  (1969)  for  item  selection  in 
a  sequential  Bayesian  tailored  testing  procedure. 

Before the test begins, the only information available about an
examinee is the population mean ability, 0.0. Using the Bayesian procedure
given by Owen (1969), we can update that information after observing
performance on item 1. This results in two possible ability estimates: .40
after a right answer, versus -.94 for a wrong answer; associated with each
score is an appraisal of its standard error. The combination of the score
and standard error data permits us to choose two locally optimum items (one
for each score on item 1) using the least-squares criterion. Each examinee
is directed to the appropriate one of those two items. The same ability
estimation and item selection procedure is used after the examinee answers
the second item, to route him to one of four optimum third items. After
the third item is answered, the ability estimation procedure results in
just eight different estimates -- one ability estimate corresponding to
each of the eight possible patterns of item scores. Thus, a unique ability
estimate is implied by the pattern of the examinee's scores on the three
items comprising his routing test; assignment of the examinee to a level
of the multi-level second stage test is based on that implied ability
estimate.
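
The counts in this description (seven routing items, eight score patterns) follow from the binary branching structure, as the sketch below shows. It enumerates the tree only; the Bayesian ability estimates attached to each node are not computed, and the helper name is hypothetical.

```python
from itertools import product

N_ROUTING_ITEMS = 3  # each examinee answers three routing items

# Every possible right (1) / wrong (0) score pattern on the routing test.
patterns = list(product((0, 1), repeat=N_ROUTING_ITEMS))

# Stage s of the tree holds 2**s distinct items: 1 + 2 + 4 in total.
items_in_tree = sum(2 ** stage for stage in range(N_ROUTING_ITEMS))


def item_position(scores_so_far):
    """0-based position, within the next stage, of the item an examinee sees."""
    index = 0
    for score in scores_so_far:
        index = 2 * index + score
    return index


print(len(patterns), items_in_tree)   # 8 7
```

After all three scores are in, `item_position` applied to the full pattern numbers the eight outcomes 0 through 7, each of which maps to one implied ability estimate.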

The routing test may be computer-administered, in which case it is
identical in scoring and item selection to Owen's Bayesian sequential
tailored testing method. It will retain these properties if
paper-and-pencil administration is used, and will have several important advantages
over previously proposed paper-and-pencil branching tests. The primary
advantage is that the branching task is quite simple: from the first item
to the second, from the second to the third, and from the third item to
one level of the multi-level test (each level of which may be printed on
a separate page in the test booklet). The second advantage is that the
test is optimally scorable even if an examinee makes an error in branching;
this is so because the test is item response theory based; each item in
the pool has had its item characteristic curve parameters estimated in
advance, so that the test may be computer-scored using maximum likelihood
estimation, with all scores expressed in a common metric regardless of
what particular set of items the individual examinee has answered.


The Multi-Level Test

After  seven  items  from  the  50-item  pool  had  been  reserved  for  the 
routing  test,  forty-three  items  remained  from  which  to  construct  the 
different levels of the multi-level second-stage test. Each level
required seven items, but some item overlap was considered desirable to
minimize the seriousness of routing errors.

There could be as many as eight levels -- one for each ability
estimate resulting from the routing test. Items were assigned to levels using
the "cut-and-try" method; the effect of each trial was analyzed
psychometrically. It was finally determined that due to the small item pool
there was no benefit to using more than six levels.

Figure 4 shows the allocation of test items to the six level tests;
each item's psychometric characteristics are indicated by its location in
the two-dimensional plane formed by item characteristic curve difficulty
(horizontal axis) and discrimination (vertical axis). The arrows below
the horizontal axis indicate the scale values of each of the eight
possible routing test scores. The cluster of seven items constituting
the level test corresponding to each routing test score is indicated on the
figure.


Information Analyses

The combination of eight routing tests, and six different levels of
the second-stage test, results in an adaptive test which administers one
of eight different combinations of ten items to each examinee. The test
information function of each 10-item combination was computed separately,
along with the conditional probability of that combination occurring for
specified ability levels. The overall information function of the
adaptive test was computed from that data, using the formula:

$$I(\theta) = \sum_{k=1}^{8} I_k(\theta)\, P(v_k \mid \theta),$$

where

$I(\theta)$ = the adaptive test information function value;

$I_k(\theta)$ = the information function value of combination $k$ ($1 \le k \le 8$);

$P(v_k \mid \theta)$ = the conditional probability of response score pattern
$k$ on the routing test.

$I_k(\theta)$ is calculated directly from the item characteristic curve parameters
of the 10 items in a given combination $k$. $P(v_k \mid \theta)$ is calculated from the
item characteristic curve parameters of the routing test leading to
combination $k$.
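
A minimal sketch of this computation follows. It assumes the three-parameter logistic model with the conventional scaling constant 1.7, takes each combination's information as the sum of its item informations, and takes the pattern probability as the product of the right or wrong probabilities along the routing path. The item parameters in the example are hypothetical; the paper's actual values are not given in the text.

```python
from math import exp

D = 1.7  # conventional logistic scaling constant (assumption)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct answer."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def combination_information(theta, items):
    """I_k(theta): sum of item informations over the items in combination k."""
    return sum(item_information(theta, *item) for item in items)


def pattern_probability(theta, path_items, scores):
    """P(v_k | theta): product over the routing items on the path to k."""
    prob = 1.0
    for (a, b, c), score in zip(path_items, scores):
        p = p_correct(theta, a, b, c)
        prob *= p if score == 1 else 1.0 - p
    return prob


def adaptive_information(theta, combinations):
    """I(theta) = sum over k of I_k(theta) * P(v_k | theta).

    combinations: one (items, path_items, scores) triple per routing pattern.
    """
    return sum(
        combination_information(theta, items)
        * pattern_probability(theta, path_items, scores)
        for items, path_items, scores in combinations
    )


# Hypothetical check: if all eight patterns led to the same ten items, the
# weights would sum to one and I(theta) would equal that combination's value.
ten_items = [(1.2, -1.0 + 0.2 * i, 0.2) for i in range(10)]
routing_path = [(1.0, 0.0, 0.2)] * 3
all_patterns = [(i, j, k) for i in (0, 1) for j in (0, 1) for k in (0, 1)]
toy = [(ten_items, routing_path, scores) for scores in all_patterns]
```

In the real test each pattern leads to its own ten-item combination, so the weights shift toward easier combinations at low ability and harder ones at high ability.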

The resulting information function is depicted in Figure 5. The graph
of the corresponding conditional standard error curve is in Figure 6,
overlaid with the counterpart curve for the 20-item conventional test
(repeated from Figure 1). Figure 6 can be interpreted in the following
manner: The conventional AR test achieves its lowest levels of measurement
error in the region of the ability scale between 100 and 120; on the
Army Standard Score scale this is the range between the population mean
and one standard deviation above the mean. If a conditional standard error
value of 7.0 is taken as tolerable (i.e., a standard error of measurement
corresponding to a reliability coefficient of .88), the conventional test
has satisfactory measurement properties from about 90 (one-half standard
deviation below the mean) to 130 (one and one-half S.D.'s above the mean).
It is not satisfactory in the range from 70 to 80, which is perhaps the
most crucial region of the scale for Army enlistment screening purposes.
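
The correspondence between a standard error of 7.0 and a reliability of .88 follows from the classical relation SEM = SD * sqrt(1 - reliability). The short check below assumes a standard deviation of 20 on the Army Standard Score scale, which is inferred from the statement that 100 to 120 spans one standard deviation. The function names are illustrative, and the second function assumes the information values are expressed on the same scale as the scores.

```python
import math


def sem_from_reliability(sd, reliability):
    """Classical standard error of measurement: SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)


def conditional_se(information):
    """Conditional standard error as the reciprocal square root of information."""
    return 1.0 / math.sqrt(information)


# The standard deviation of 20 is inferred from the text, not stated outright.
print(round(sem_from_reliability(20.0, 0.88), 2))  # 6.93
```

The result, about 6.93, rounds to the tolerable value of 7.0 used in the interpretation of Figure 6.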

The curve of measurement error for the adaptive test shows it to be
satisfactory throughout the range from about 75 to 136 on the Standard
Score scale. It is notably superior to the conventional test in the critical
range from 70 to 80, which implies that it should be a better test for
screening prospective enlistees than is the operational AR subtest of
ASVAB Form 6. The adaptive test's measurement error characteristics are
inferior to those of the conventional test between 100 and 120 on the score
scale; the differences are slight, however, and the involved region of
the scale is not critical for most Army screening or classification
purposes.

Another way to compare the 10-item adaptive test with the 20-item
conventional one is by means of a relative efficiency analysis. The
relative efficiency (RE) index is simply the ratio of one test's information
function to that of the other test. Since the adaptive test is here
being considered as an alternative to the conventional one, its relative
efficiency should equal or exceed 1.0 throughout the range of interest on
the ability scale. Figure 7 is a plot of the relative efficiency of the
adaptive AR test, compared to the conventional one. It tells the same
story as Figure 6: The adaptive test has better measurement properties
than the operational AR test throughout the important range of the standard
score scale, with the exception of the region from the mean to one
standard deviation above the mean, where it is only slightly inferior to
the conventional test.
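
The relative efficiency index can be computed pointwise once both information functions have been evaluated on a common ability grid. The sketch below is illustrative only; the function names are hypothetical and no values from Figures 5 through 7 are reproduced.

```python
def relative_efficiency(info_adaptive, info_conventional):
    """Pointwise ratio of two information functions evaluated on the same grid."""
    if len(info_adaptive) != len(info_conventional):
        raise ValueError("information curves must share the same ability grid")
    return [ia / ic for ia, ic in zip(info_adaptive, info_conventional)]


def levels_at_least_as_efficient(grid, re_values, threshold=1.0):
    """Ability levels at which the first test is at least as efficient as the second."""
    return [level for level, value in zip(grid, re_values) if value >= threshold]
```

Since the conditional standard error is the reciprocal square root of information, a relative efficiency of 1.0 or more at a given ability level is equivalent to the adaptive test having a standard error no larger than that of the conventional test at that level, which is why the two figures tell the same story.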


CONCLUSIONS 


The information, conditional standard error, and relative efficiency
analyses above show that it is feasible to construct a 10-item adaptive
test of Arithmetic Reasoning which has overall measurement properties as
good as those of a conventional test twice as long. It should be just as
feasible for the other non-speeded ASVAB subtests. By virtue of the
relationship between the information function and the reliability
coefficient pointed out by Samejima (1977), the information analyses
reported above imply that, for the military mobilization population, the
short adaptive test should have reliability (and hence validity) at least
equal to that of the longer, operational test.

The results reported above have important implications for the
feasibility of implementing adaptive testing in the military selection setting.
One implication is that successful adaptive counterparts of today's
operational screening and classification tests can be implemented, using
relatively small item pools. Since the 50-item pool used for this study
was composed solely of items from now obsolete operational test forms,
a second implication is that the quality and distribution of test items
constituting our operational tests are quite adequate for adaptive testing
purposes. A third implication, this one following from the simplicity of
the adaptive testing procedure which was used, is that in principle it is
feasible to develop adaptive tests which can be administered in
paper-and-pencil form, without excessively complex instructions, and without the
need for scoring intermediate stages of the test.

Figure 3. A schematic of the branching rule for the 3-item Bayesian
adaptive routing test.

ABSTRACT


The  objective  of  the  research  effort  described  in  this  paper  is  the 
development  of  a  model  that  will  project  a  training  program  for  new 
weapon systems early in their development. It allows for the introduction
of time and dollar constraints and uses a basic unit of input
defined  as  a  task.  The  target  application  is  a  new  concept  of  avionics 
integration.  However,  the  computerized  analytical  model  and  its 
associated  data  bank  form  the  basis  of  a  methodology  for  allowing  the 
potential  training  consequences  of  selected  design  options  for  any  new 
weapon  system  to  be  more  fully  considered  in  the  design  process. 

Quantitative analysis of the potential human resources requirements
of weapon systems, still within the design process, has been
recently improved by technological advances in the application of
simulation modeling techniques. These advances have not, however,
been extended to the qualitative aspect of system personnel requirements
to enable the simultaneous conduct of a similarly detailed
training  analysis.  This  is  unfortunate  because  the  analysis  of  training 
impacts  is  essential  during  the  design  process  to  produce  designs 
which  maximize  operational  effectiveness  at  minimum  cost. 

The  methodology  reported  addresses  the  qualitative  aspects  of 
human  resources  requirements  through  the  choice  of  training  options. 
Using  a  technique  for  classifying  learning  requirements  and  requisite 
training  options  as  a  function  of  performance  tasks,  it  bridges  design 
and training in a way which allows cost and operational constraints to
be not only considered but also traded off against each other. It provides
a capability to rapidly assess training requirements and select a
training approach and program most appropriate within the limits
established by a set of user-specifiable constraining conditions.
Examples  of  these  are  cost,  training  time,  student  flow,  maintenance 
policy,  and  planned  use  of  job  performance  aids. 

Application of this methodology during the early stages of design
can provide a means to relate operational, training, and personnel
constraints to design early enough to allow designers to incorporate
the results in the key decisions of design finalization within the systems
acquisition process.

INTRODUCTION 

The  training  model  described  in  this  presentation  is  being 
developed  to  meet  an  immediate  need  for  means  to  assess  the  impact 
of  a  new  concept  of  avionics  integration  on  training.  It  is  also  designed 
to  provide  a  basic  tool  for  examining  the  consequences  of  almost  any 
set  of  circumstances  which  bears  upon  training  needs  and  how  they  are 
to  be  fulfilled.  Although  capable  of  independent  operation,  it  is  a  part 
of a life cycle cost (LCC) modeling system being constructed within a
LCC  study  in  the  Digital  Avionics  Information  System  (DAIS)  advanced 
development  program.  The  overall  objective  of  that  study  is  to  assess 
the  LCC  impact  of  the  DAIS  and  also  to  provide  more  adequate  means 
for  incorporating  LCC  considerations  into  design,  operations,  and 
maintenance  decisions  throughout  the  systems  acquisition  process, 
particularly  in  its  early  stages. 

Although the training model data bank development is currently
specific  to  avionics,  the  model  represents  an  extremely  broad 
approach  to  training  analysis.  Its  primary  contribution  to  training 
technology  is  its  generalizability  and  the  fact  that  it  establishes  an 
increased  degree  of  logic  and  mechanization  in  a  task  which  is  often 
thought  to  be  more  of  an  art  than  a  science.  Rather  than  a  completed 
structure,  it  is  the  framework  of  a  training  evaluation  process  to  be 
built  upon  and  expanded  to  more  adequately  address  specific  needs. 

The model allows a training analyst to assign values to
variables describing systems, policy, training operations, resources,
and cost. Within the bounds of the user-established set of constraints,
it produces an estimate of the training program requirement which
their interactions generate. Results may be refined by iteratively
exercising the model using different values for constraint parameters
and/or input data. The means to relate system/policy/resources/cost
input data to resultant training impacts are contained in the model.
The rapidity and ease of their exercise, and the way in which the
model facilitates their iterative implementation, solve a number of
the problems in training impact analysis. Among these are the early
identification of excessive requirements, timely investigation of
alternatives, and training cost estimation.

The  modeling  approach  to  training  impact  analysis  affords  a 
capability  which  the  training  community  has  been  seeking  for  quite 
some  time.  It  introduces  the  methodology  to  play  a  more  active  role 
in  controlling  the  onset  of  training  programs.  By  allowing  attention  to 
be  focused  on  the  relative  effects  of  input  data  changes  rather  than  on 
the  calculations  involved  in  quantifying  their  interactions,  the 
modeling concept also allows the training analyst to pinpoint causational
antecedents in design, policy, etc., which could give rise to
problems in the planning of training. This information can be transmitted
to the designers of weapon systems and policy planners for
their  consideration  along  with  other  requirements.  Providing  the 
capability  to  impart  increased  foresight  to  designers  and  planners  is, 
in  fact,  one  of  the  primary  functions  of  the  DAIS  LCC  study  under 
which  the  model  is  being  developed. 

The  DAIS  advanced  development  program  is  an  Air  Force 
Avionics  Laboratory  program  seeking  to  demonstrate  a  solution  to  the 
problems of proliferation and non-standardization of aircraft avionics.
It  is  developing  and  testing  a  concept  of  integrated  avionics  as  an 
information  management  system.  This  concept  proposes  that  the 
processing,  multiplex  transfer,  and  display  functions  of  avionics 
subsystems  be  common  and  serve  all  the  other  avionics  functions  on 
an  integrated  basis. 

Historically,  mission  information  requirements  have  been 
established  along  essentially  autonomous  subsystem  areas  such  as 
navigation,  weapon  delivery,  stores  management,  and  flight  control. 
The  resulting  complexity  of  new  system  configurations,  designed  to 
meet  these  requirements  in  the  fashion  in  which  they  were  established, 
has  led  to  ever  increasing  system  support  requirements.  These 
translate  to  increases  in  system  LCC.  This,  and  the  fact  that  nearly 
all  avionics  subsystems  have  trended  toward  digital  methods  which  can 
facilitate  functional  integration  across  subsystems,  were  major 
catalysts  of  the  DAIS  program. 


Clearly,  the  DAIS  concept  possesses  a  potential  to  effect 
significant  changes  in  the  design,  procurement,  operation,  and 
support  of  weapon  systems.  Also  apparent  is  the  fact  that  these 
impacts  can  be  as  variable  as  they  are  numerous,  depending  upon 
such  things  as  degree  or  manner  of  implementation.  The  DAIS  LCC 
study  is  addressing  these  issues  by  expanding  upon  available 
technology  for  identifying  and  quantifying  the  consequences  of  design 
on  system  ownership.  The  DAIS  training  model  and  data  bank  is  but 
one  result  of  a  search  for  means  to  pinpoint  specific  impacts  of  design 
on  individual  components  of  LCC  in  rapid  fashion,  using  conceptual 
level  design  data. 

Together,  the  Air  Force  Human  Resources  Laboratory  and 
Dynamics  Research  Corporation  of  Wilmington,  Massachusetts,  are 
engaged  in  an  effort  to  construct  a  LCC  modeling  system  capable  of 
assessing  the  various  impacts  of  new  weapon  systems  either  singly  or 
in  concert.  One  of  the  components  of  this  system  is  the  DAIS  training 
model.  It  is  a  computerized  analytical  model  which,  in  the  context  of 
the  overall  system,  provides  requisite  information  so  that  DAIS 
training  costs  can  be  computed.  This  presentation  will  indicate  how  it 
can  be  used:  (1)  to  assist  the  training  analyst  in  conducting  trade-off 
studies  to  define  the  most  cost  effective  training  program;  and  (2)  to 
suggest  a  procedure  for  influencing  design  and  support  system 
concepts  based  upon  associated  training  requirements  and  their  cost. 

As  you  all  probably  know,  the  Instructional  System  Development 
process,  defined  in  Air  Force  Manual  50-2,  is  the  foundation  of  the 
Air  Force  approach  to  insuring  cost  effective  instruction.  The 
research  tool  described  here  parallels  that  process  very  closely.  It 
might  even  be  said  that  it  represents  a  meaningful  step  toward  its 
mechanization in developing and refining education and training programs.
The following is an overview of its operation.

OPERATION  OF  THE  MODEL 


The training model (Slide 1) consists of three modules: a
pre-processor and two analytical modules for training plan and training
program generation.

Operation of the model is predicated upon the establishment of a
data bank containing the set of tasks to be learned. Their level of
specificity is a user-defined variable, allowing for flexibility of task
definition. Each task, however, is assigned five descriptor values
denoting: frequency, criticality, learning difficulty, taxonomy, and
sequencing.

The data bank is input to the pre-processor module, which
screens the total set of tasks, in a series of go/no-go decisions, to
select those which require training. The selected tasks then become
the subset of tasks that are the training requirement. The selection is
based upon pre-established descriptor value levels determined by the
user. For example, a criterion of tasks of a difficulty level above .60
may be used to discriminate between tasks on the basis of that
parameter. Thus, the user maintains control of the decision process
by his selection of decision criteria, i.e., parameter combinations
and parameter value cut-off points. The list of tasks which the
pre-processor determines to be requirements for training retains its
associated set of descriptor values and becomes the input data set for
the first analytical module, which is the training plan generator.
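
The go/no-go screening described above can be sketched as a simple filter over task records. This is a minimal illustration under stated assumptions, not the DAIS implementation: the `Task` fields, the 0-to-1 difficulty scale (chosen to match the .60 example), and the rule that critical tasks always pass are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    frequency: float      # relative frequency of occurrence
    critical: bool        # dichotomous criticality
    difficulty: float     # learning difficulty, scaled here to 0..1 (assumption)
    taxonomy: tuple       # (cognitive level, psycho-motor level)
    sequence_group: str   # label for tasks to be trained together


def requires_training(task, min_difficulty=0.60, min_frequency=0.0):
    """Go/no-go screen: critical tasks pass; others must clear both cut-offs.

    The combination rule is an assumed example of a user-selected criterion.
    """
    if task.critical:
        return True
    return task.difficulty > min_difficulty and task.frequency >= min_frequency


def preprocess(tasks, **cutoffs):
    """Return the subset of tasks that becomes the training requirement."""
    return [t for t in tasks if requires_training(t, **cutoffs)]
```

Calling `preprocess` again with different cut-off values yields a different training requirement, which mirrors the way the user retains control of the decision process. Each selected task keeps its full set of descriptor values for the next module.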

At  this  point,  it  is  assumed  that  all  of  the  outputted  tasks  are  to 
be  trained.  The  user  now  has  the  option  of  designating  a  value  for  any 
one  of  three  constraining  conditions:  personnel  required  (number); 
maximum  training  cost  (dollars);  or  maximum  training  time  (months). 
He  need,  however,  only  specify  the  trained  personnel  requirement  to 
operate  the  module  using  internalized  data  and  relationships.  The 
training  plan  generator  then  produces  an  initial  training  plan.  This 
is a two step process in which a minimum cost school/on-the-job
training (OJT) mix is determined, followed by recommendations
concerning appropriate methods and media, e.g., lecture,
simulation, mockups, actual equipment, etc.


After  reviewing  the  initial  training  plan,  the  user  may  either 
select  a  different  set  of  decision  criteria  and  exercise  the  training 
plan  generator  module  to  obtain  another  training  plan,  or  continue  on 
to  the  second  analytical  module  to  generate  a  training  program. 
Generally,  the  training  plan  generator  will  be  iterated  several  times 
by the user as an investigative/optimization procedure prior to the
selection  of  a  training  plan  to  be  examined  in  more  detail. 

The  training  program  generator  uses  the  outputs  of  the  training 
plan  generator,  along  with  either  user  specified  or  standard  model 
values  for  decision  criteria,  to  produce  a  representative  training 
program. This consists of a schedule; number of classes per program;
number of instructors, simulators, etc., per program; course
length;  estimated  cost;  etc.  As  in  the  exercising  of  the  training  plan 
generator,  it  is  expected  that  the  user  will  also  use  the  initial  training 
program  as  the  basis  for  iteration  to  examine  the  effects  of  changing 
the  values  of  the  input  parameters  under  his  control.  We  will  now  go 
into  further  detail  concerning  data  bank  construction  and  the  functions 
and  capabilities  of  each  portion  of  the  model. 

TRAINING  MODEL  DATA  BANK 

The  first  step  in  analyzing  the  training  impact  of  a  new  system 
is  the  establishment  of  a  data  bank  containing  information  for  use  in 
translating  the  equipment  and/or  maintenance  characteristics  of  that 
system  into  the  basic  elements  which  govern  the  establishment  of 
training  plans  and  programs.  Basically,  this  consists  of  a  systems 
maintenance/operations  requirements  analysis  in  terms  of  tasks  and 
their  descriptors,  and  a  subsequent  analysis  of  the  identified  tasks  in 
terms  of  the  behaviors  they  subsume.  The  latter  analysis  is,  in  many 
ways,  analogous  to  the  former.  It  is  conducted  to  achieve  a  more 
refined  description  of  tasks  in  terms  of  parameters  which  can  later  be 
used  to  classify  and  grade  them.  This  classification  and  grading  is  the 
basis  used  in  the  pre-processor  for  decision  making  concerning  which 
tasks are to be trained and later, in the analytical modeling components,
for decisions concerning training plan and program definition.
The  tasks  are  then  grouped  by  career  field  designation.  However,  this 
last  step  is  solely  for  the  purpose  of  data  bank  organization.  It  is 
assumed  that  each  exercise  of  the  model  is  to  be  accomplished  using 
tasks  within  a  single  personnel  category. 


Data for the DAIS application was gathered solely in the maintenance
area, on the basis of equipment comparability analyses and
historical records. It could, however, have been gathered on operator
tasks or have been based on survey, interview, or time and motion
study  techniques.  Perhaps  one  of  the  most  critical  aspects  of  data 
bank  development  is  the  selection  of  the  task  descriptors  which  it  will 
contain.  They  must  achieve  a  balance  between  the  degree  of  specificity 
required  to  perform  meaningful  translations  and  the  latitude  of 
applicability  required  to  maintain  propriety  across  a  wide  variety  of 
tasks. The descriptors we have selected are representative but not
exhaustive.

Each  task  description  consists  of  graded  assessments  in  terms 
of  the  task  descriptors  used  to  translate  equipment  parameters  to 
training parameters. The DAIS training data bank is formatted in the
manner of a matrix, as shown in Slide 2. Tasks are related to
equipment and, when appropriate, the maintenance event or operational
activity required to restore the equipment to operational
readiness. Each task is evaluated in terms of five descriptor
parameters: (1) frequency; (2) criticality; (3) learning difficulty;
(4) taxonomy groupings; and (5) a parameter describing sequenced tasks to
be trained as a group.

Frequency is a relative measure of how often the task occurs or
must be performed. Task criticality and learning difficulty are
assessed according to Instructional System Development guidelines.
Criticality is determined on the basis of two factors: whether the task
is required under emergency conditions, and the consequences of
inadequate performance. In the training model this is a simple
dichotomous choice category. Learning difficulty is assessed on the
basis of task complexity and the knowledge and performance requirements
associated with it. We are using a five step range to grade the
tasks.  In  the  case  where  tasks  can  be  broken  into  behaviors,  these 
are  graded  individually  and  their  scores  aggregated  to  obtain  a 
composite  task  score. 


Slide  2:  TRAINING  MODEL  DATA  BANK 


The taxonomy descriptor parameters and the scales which
define their levels were adapted from Bloom's representation of
human behavior. Tasks are described in terms of two classifications:
cognitive and psycho-motor activity. The tasks are then judged within
each category according to the scalar level designations shown in
Slide 3. For example, the behavior of coding, within the task of
computer programming, might be a cognitive level four behavior and
a psycho-motor level one behavior, whereas the behavior of soldering,
within the task of equipment reassembly, might be a cognitive level
two and a psycho-motor level three.

It  might  be  well  to  note  at  this  point  that  we  recognize  that  this 
particular  method  of  task  classification  and  grading  is,  by  no  means, 
comprehensive  and  that  the  scaling  is  subjective.  However,  they  are 
practical  starting  points  for  providing  a  common  denominator  for  a 
wide  variety  of  tasks.  This  is  an  essential  ingredient  in  translating 
equipment  and/or  maintenance  characteristics  of  a  system  into  the 
task performance criteria which determine a need for training programs.

The  fifth  descriptor  parameter,  the  task  sequencing  or  "nesting" 
factor,  is  relatively  self-explanatory.  It  serves  to  designate  tasks 
which  logically  fall  together,  either  on  the  basis  of  their  performance 
interaction  or  requirements  generated  by  the  actual  provision  of 
training. 

TRAINING MODEL PRE-PROCESSOR


The pre-processor contains a set of selection algorithms which
determine the subset of tasks that become the training requirements.
The five descriptor parameters are the criteria for a series of
decision filters, whose cut-off levels are set by the user. Their
function varies from simple sorting, as shown in Slide 4, to weighted
averaging for the calculation of a task parameter which we call
intensity. This combinatorial parameter can be used in conjunction
with an algorithm within the module to more closely examine the
potential effects of either operational or policy alternatives of the
system which will provide the training. This mode of operation allows
the user great flexibility in emphasizing or de-emphasizing any or all
of the five descriptor parameters.
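
The weighted averaging behind the intensity parameter might look like the following sketch. The exact formula used in the model is not given here, so this assumes a plain weighted mean over descriptor scores that have already been mapped to a common numeric scale; the function name, the descriptor keys, and the weights are hypothetical.

```python
def intensity(descriptors, weights):
    """Weighted average of descriptor scores; weights need not sum to one.

    Both arguments are dicts keyed by descriptor name. A weight of zero
    removes a descriptor from the result (de-emphasis taken to its limit).
    """
    total_weight = sum(weights[name] for name in descriptors)
    if total_weight == 0:
        raise ValueError("at least one descriptor must carry a nonzero weight")
    weighted_sum = sum(descriptors[name] * weights[name] for name in descriptors)
    return weighted_sum / total_weight
```

Raising the weight on one descriptor, for example criticality, pulls the intensity toward that descriptor's score, which is one way a user could emphasize or de-emphasize individual parameters before applying a cut-off.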


In  order  to  assess  the  impact  of  various  criteria  on  the 
establishment  of  training  requirements,  the  user  inputs  his  selection 
of  descriptor  parameter  cut-off  levels  (decision  criteria),  and 
operates  the  pre-processor  to  yield  a  set  of  training  requirements  and 
their  characterizations  in  terms  of  the  descriptor  parameters.  Each 
time  the  training  analyst  exercises  the  pre-processor  with  a  different 
set  of  decision  criteria,  the  result  is  a  different  set  of  training 
requirements.  Thus  he  can  analyze  the  direct  impacts  of  design  and/or 
policy  on  training  requirements. 

TRAINING  PLAN  GENERATOR 

The training plan generator allows the user to analyze the direct
impacts of varying either a personnel quantity, training time, or cost
constraint on the establishment of a training plan. Its required inputs
are the outputs of the pre-processor and specification by the user of
the number of trained personnel required. The opportunity to include
cost and time constraints and parameters which reflect his knowledge
of such things as training facilities, costing procedures, personnel
categorization within the organization, etc., allows him to tailor the
input to reflect known constraints governing the capabilities of the
training providers or to conduct sensitivity analyses.

The output of the training plan generator module is a minimum
cost initial training plan that meets the input constraints. It consists
of task groups identified by a school/OJT mix and appropriate
training methods and media, along with cost and time estimates. This
is accomplished in a two step process which first determines an
optimum school/OJT mix and then proceeds to select appropriate
training methods and media. Both steps are accomplished on the basis
of the descriptor parameter values assigned to each task. To determine
the school/OJT mix, the module takes the training requirement
tasks, along with the user input of number of personnel required, and
examines their descriptor values in terms of a number of equations
relating cost, time, and type of training.
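
The school/OJT decision for a single task group can be illustrated with a deliberately simple cost comparison. The model's actual equations are not reproduced here; the sketch assumes each training type has a fixed cost, a per-trainee cost, and a duration, and every name and figure is hypothetical.

```python
def cheapest_training_type(n_trainees, school, ojt, max_months=None):
    """Pick the cheaper of two training types for one task group.

    Each option is a dict with hypothetical keys: 'fixed' (dollars),
    'per_trainee' (dollars), and 'months' (training time).
    """
    options = {"school": school, "OJT": ojt}
    # Drop any option that violates the optional training time constraint.
    feasible = {
        name: opt for name, opt in options.items()
        if max_months is None or opt["months"] <= max_months
    }
    if not feasible:
        raise ValueError("no training type satisfies the time constraint")
    costs = {
        name: opt["fixed"] + opt["per_trainee"] * n_trainees
        for name, opt in feasible.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]
```

Under these assumed cost structures, a type with high fixed cost but low per-trainee cost becomes preferable as the number of personnel grows, which suggests why the trained personnel requirement is the one input the module cannot do without.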


Standard  values  within  the  module  for  the  descriptor  parameter 
decision  criteria,  as  well  as  for  factors  within  the  equations,  are 
addressable  by  the  user  for  revision.  It  should  also  be  noted  that  an 
assumption  is  made  that  school  and  OJT  are  equally  effective,  at  least 
in  rendering  satisfactory  results  in  terms  of  trainee  proficiency.  It  is 
also  assumed  that  all  training  is  to  be  task  oriented.  Recognizing  that 
these  assumptions  greatly  simplify  the  job  of  the  training  plan 
generator,  we  have  allowed  in  the  module  for  their  revision. 

Also,  while  the  cost  relationship  equations  within  the  module 
appear  sufficient  for  the  school/OJT  decision,  they  are  not  used  in  the 
estimation  of  the  cost  of  the  training  program  which  the  model  will 
generate  as  a  final  product.  Detailed  costing  of  the  training  program 
is  a  separate  effort  which  uses  the  cost  figures  generated  by  the 
training  model  as  a  starting  point. 

At this point, the module has grouped the tasks to be trained on
the basis of the sequencing descriptor parameter, and designated them
by training type. In the second step of the training plan process, they
are again sorted by means of an algorithm which is essentially a task/
training objective comparator. It first maps a training objective
profile for each task group, on the basis of the taxonomy descriptor
parameter values for the tasks within each group (Slide 5), and
compares the results with similar profiles established for various
training methods and media. The profile of training objectives is the
common denominator for tasks and training methods and media.
The profiles established for the training methods and media are based on
criteria established by Parker and Down in 1961 (Slide 6). Once the
task groups have been associated with specific training methods and
media, an output is generated which provides a training plan broken
out by task group, by school/OJT designation, and by methods/media
recommendation. Cost estimates and time requirements are also
included (Slide 7).
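
The comparator step can be pictured as matching a task group's profile against a table of media profiles. The sketch below is an assumption-laden illustration: it takes the group profile to be the highest cognitive and psycho-motor level among the group's tasks, and it treats a medium as suitable when its profile meets or exceeds the group's on both scales. Neither rule, nor any media profile value, comes from the model itself.

```python
def group_profile(task_levels):
    """Training objective profile for a task group.

    `task_levels` is a list of (cognitive, psycho_motor) level pairs; the
    profile is the maximum on each scale (an assumed aggregation rule).
    """
    return (max(c for c, _ in task_levels), max(p for _, p in task_levels))


def match_media(profile, media_profiles):
    """Return suitable media names, ordered by how little they overshoot.

    `media_profiles` maps a medium name to a hypothetical
    (cognitive, psycho_motor) capability pair.
    """
    cog, psy = profile
    suitable = [
        ((m_cog - cog) + (m_psy - psy), name)
        for name, (m_cog, m_psy) in media_profiles.items()
        if m_cog >= cog and m_psy >= psy
    ]
    return [name for _, name in sorted(suitable)]
```

Because both the task-to-objective mapping and the objective-to-media table are plain data in this sketch, either can be swapped out without changing the matching logic.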

The structure of the training plan generator allows the user
considerable discretion in its control. In similar fashion to his control
over the training type decision process, he can alter the rules for the
process which associates tasks with training objectives or simply
specify a particular mapping. He can also alter the rules for the
process which associates training objectives with training methods and
media. These capabilities are available as input selections and do not
require any re-programming. It should be noted that, while the training
methods/media selection mechanism does not break down
recommendations to specific implements nor deal with specific
numbers of instructors, simulator types, etc., it is capable of doing
so. The obstacle to this finer grained analysis is a lack of data at the
present time, not a programming limitation.

It should be clear at this time that each successive phase of the
model’s  exercise  allows  the  user  to  investigate  the  impacts  of  results 
that  have  preceded  it.  By  inputting  precise  data,  as  it  becomes  known, 
the  user  can  generate  very  precise  impact  estimates.  By  using  the 
standard  relationships  within  the  model  itself,  he  can  also  obtain 
relative  impact  estimates  of  great  value  early  in  the  system  design 
process.  Training  program  generation  is  the  last  phase  of  modeling 
activity.  It  provides  information  needed  to  calculate  a  cost  estimate 
for the training plan selected and/or optimize resource consumption.

TRAINING PROGRAM GENERATOR

The  training  program  generator  module  takes  the  outputs  of  the 
training  plan  generator  and  produces  a  training  program  on  the  basis 
of  internalized  rules  of  resource  management.  The  training  program 
consists  of  a  schedule,  number  of  students  per  program,  number  of 
instructors,  course  length,  etc.  The  user  specifies  the  required 
number of trained personnel needed per year, minimum/maximum
class size, resources available and cost, and a ranking of both
resources and training program objectives according to their relative
importance. For resources, this ranking is usually based on availability
and/or cost. Training program objectives may be ranked on
almost any basis such as safety, mission, or performance requirements.
In the event that the user chooses not to specify a ranking, the
algorithm within the training program generator assumes that
simulators are the high cost drivers and that instructors are the
scarcest resource. Given the training plan, the ranking of resources
and training objectives, and the personnel requirements, etc., the
algorithm undertakes an optimization routine to yield a reasonable
first cut at a training program of sufficient detail to be costed
(Slide 8).
Results may be iterated to determine various sensitivities.

Doing so may reveal inordinacies in resource consumption, cost, etc.,
which might be avoided by changes upstream closer to the equipment
design  end  of  the  training  analysis  procedure.  The  capability  for 
iteration,  within  and  across  the  components  of  the  training  model, 
using  different  sets  of  criteria  is,  in  fact,  its  strongest  feature. 
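
As a rough illustration of the kind of calculation the training program generator performs, the sketch below sizes a course from two of the user inputs named above: the required number of trained personnel per year and the minimum/maximum class size. The function name, the rule of using the fewest classes possible, and the classes-per-instructor figure are hypothetical assumptions; the paper does not give the actual algorithm.

```python
import math

def size_program(graduates_per_year, min_class, max_class, classes_per_instructor=4):
    """Return (classes per year, students per class, instructors).

    Hypothetical rule: run the fewest classes that keep every class at or
    below max_class, then confirm the class size is not below min_class.
    """
    if graduates_per_year <= 0 or not 0 < min_class <= max_class:
        raise ValueError("need graduates_per_year > 0 and 0 < min_class <= max_class")
    classes = math.ceil(graduates_per_year / max_class)
    students = math.ceil(graduates_per_year / classes)
    if students < min_class:
        raise ValueError("requirement too small to fill a minimum-size class")
    # Assumed staffing rule: one instructor can teach a fixed number of classes a year.
    instructors = math.ceil(classes / classes_per_instructor)
    return classes, students, instructors

print(size_program(100, 8, 12))
```

Re-running such a function with different class-size limits or staffing assumptions is one simple way to explore the sensitivities described above.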

SUMMARY 

In summation (Slide 9), I would like to re-emphasize that the
training  model  presented  today  should  be  thought  of  as  the  first  of  a 
series  to  be  modified  and  refined  in  the  future.  Perhaps  its  most 
significant  aspect  is  its  potential  to  be  useful  in  an  extremely 
diversified  array  of  applications.  This  potential  is  based  upon  its 
generalizable structure. This is best summed up by saying that it
provides  a  means  to  standardize  training  impact  analysis. 

An  extensive  repertoire  of  training  technology  exists  which 
supports  the  design  of  training  systems.  The  training  model  provides 
a  means  to  facilitate  its  concerted  and  timely  application.  Decisions 
concerning  the  establishment  of  training  plans  and  programs  are 
becoming  more  and  more  difficult  due  to  the  ever  increasing  number 
of  variables  which  training  analysts  must  consider.  Many  of  these  are 
incidental  to  training  objectives.  I  refer  to  such  things  as  cost,  lead 
time,  and  other  variables  which  are  primarily  external  to  the  analysis 
of  training. 

Despite  a  good  technology  base,  the  increasing  number  of 
variables  and  the  tracking  of  their  interactions  tend  to  preclude 
comprehensive  analysis  by  the  sheer  complexity  of  their  calculation. 
This situation is further exacerbated by the narrowness of the time
frame in which the results of training analyses may provide useful
feedback  to  designers  and  planners.  It  is  also  assuming  increased 
importance  as  planners  become  more  attentive  to  the  life  cycle  cost 
aspect  of  systems  acquisition. 

Training is expensive. Its expense reaches far beyond the cost
of  producing  trained  personnel.  The  real  cost  of  training  includes 
penalties  paid  in  terms  of  lost  opportunities.  These  are  the  costs 
associated  with  failures  to  capitalize  on  numerous  potentials  for  cost 
avoidance, due to an inability to extend the analysis of training
requirements beyond its present role of reacting to given sets of
conditions.  Clearly,  it  would  be  advantageous  for  training  analysis  to 
change  from  a  post  hoc  activity  to  become  an  integral  part  of  the 
weapon  system  design  process.  This  requires  a  capability  to  take  part 
in  decisions  concerning  aspects  of  design  and  policy  which  create 
training  requirements.  The  fulfillment  of  training  requirements  is 
only  half  the  battle. 

The  modeling  approach  to  training  impact  analysis  can  make 
this  change  possible.  It  can  increase  the  speed  and  systematization  of 
the  procedures  entailed  in  training  planning  and  resource  management. 
Quite  apart  from  its  potential  to  aid  designers  in  developing  more 
maintainable  and  cost  effective  systems,  its  versatility  makes  it  ideal 
for  even  the  most  mundane  problems  concerning  the  provision  of 
training  and  resource  management. 

The  training  model  described  is  a  first  step  in  defining  a 
methodology  for  the  practical  application  of  the  modeling  approach. 
What remains is for the training community to continue its development
in terms of data and criteria. The model itself stands alone as
a mechanism capable of performing many of the required data
manipulations entailed in training impact analysis, allowing the user
to quickly trade off alternatives.


REFERENCES 


1.  Czuchry,  Andrew  J. ;  Engel,  Herbert  E. ;  Dowd,  Richard  A. ; 
Baran,  H.  Anthony;  Dieterly,  Major  Duncan  L. ;  and  Greene, 
Ron. Mid-1980s Digital Avionics Information System
Conceptual Design Configuration, AFHRL-TR-76-59,
Ab-Ao§5lS7. Wright-Patterson AFB, OH: Advanced Systems
Division, Air Force Human Resources Laboratory, May 1976.


2. Engel, Herbert E.; Glasier, John M.; Dowd, Richard A.;
Bristol, Marjorie A.; Baran, H. Anthony; and Dieterly, Major
Duncan L. Digital Avionics Information System (DAIS): Current
Maintenance Task Analysis. *TFHRL“f r^tS-TIT XiT-XbTSSBfr *
Wright-Patterson AFB, OH: Advanced Systems Division, Air
Force Human Resources Laboratory, October 1976.


3. Czuchry, Andrew J.; Engel, Herbert E.; Bristol, Marjorie A.;
Glasier, John M.; Baran, H. Anthony; and Dieterly, Major
Duncan L. Digital Avionics Information System (DAIS): Mid-1980s
Maintenance Task Analysis. AFHRL-TR-77-45. Wright-Patterson
AFB, OH: Advanced Systems Division, Air Force
Human Resources Laboratory, July 1977.

4.  Parker,  James  F. ;  and  Downs,  J.  E.  Selection  of  Training 
Media. ASD-TR-61-473. Wright-Patterson AFB, OH:
Aeronautical  Systems  Division,  Aerospace  Medical 
Laboratory,  Behavioral  Sciences  Laboratory,  September  1961. 


5. Bloom, B.S., Taxonomy of Educational Objectives:
Classification of Educational Goals, Cognitive and Affective
Domains. McKay Publishers, 3rd Ave., N.Y., N.Y., 1956.

6. Department of the Air Force. Instructional System
Development. Air Force Manual, AF Manual 50-i, July 1975.


CONCERNING  THE  AUTHORS 

Mr.  Baran  is  the  DAIS  Life  Cycle  Costing  Study  Manager.  He  is 
a  civilian  employee  of  the  Advanced  Systems  Division  of  the  Air 
Force  Human  Resources  Laboratory,  Wright-Patterson  Air  Force 
Base,  Ohio.  Major  Dieterly,  of  the  same  organization,  is  a  Deputy 
Program Manager (Human Resources) of the DAIS advanced development
program. Dr. Czuchry is the director of the Advanced Systems
Department  of  Dynamics  Research  Corporation,  Wilmington,  Mass. , 
the  contractor  for  the  DAIS  Life  Cycle  Costing  Study. 


Analysis  of  Flight  Clothing  Effects  on 
Aircrew  Station  Geometry 


Lt Cmdr Harvey G. Gregoire



TM  77-1  SY 


INTRODUCTION 

BACKGROUND 

1.  Typically  aircrew  station  geometry  requirements  have  been  based  on  nude 
male  anthropometric  data  taken  from  measurements  on  a  standard  anthropometric 
chair,  a  flat  seat  with  a  90  deg  perpendicular  back  surface.  Since  aircrew  persons 
do  not  fly  nude,  nor  do  they  sit  on  a  flat  surface  with  a  90  deg  perpendicular  back, 
nor are they all male anymore, it is necessary to quantify the effect of those items
worn in the aircrew station environment. The necessity to quantify the effects of
personal  flight  clothing  and  equipment  is  particularly  important  in  presently 
developing  tactical  aircraft  since  the  anticipated  higher  g  operational  environments 
are  more  restrictive  to  anthropometric  mobility  than  earlier  models  of  tactical 
aircraft.  Additionally,  the  primary  flight  instrument  status  of  Heads-Up  Displays 
and  similar  electro-optical  devices  may  limit  the  design  eye  reference  of  the  pilot's 
eye  position  to  a  greater  degree  than  other  similar  aircraft  models. 

2. Many of the prior research efforts in the area of quantifying the effects of
flight  clothing  relative  to  anthropometric  accommodation  have  generally  been  item 
specific;  i.e.,  the  effects  of  wearing  a  pressure  suit  or  a  helmet,  etc.  There  has 
been  little  research,  if  any,  on  the  anthropometric  effects  of  an  entire  complement 
of  flight  clothing  and  equipment. 

3. Military Standard 1472B, the Human Engineering Design Criteria for Military
Systems,  Equipment,  and  Facilities,  specifies  that  suitable  allowances  must  be 
made  for  the  design-critical  dimensions  imposed  by  protective  clothing  or 
equipment.  Providing  "suitable  allowances"  for  an  unknown  quantity  can  be 
difficult  at  best,  if  not  impossible.  The  failure  to  use  data  concerning  the  effect  of 
flight  clothing  and  equipment  on  anthropometry  in  the  design  of  aircrew  stations 
has  historically  been  costly  in  terms  of  aircrew  safety,  efficiency,  mobility,  and 
comfort. 

4.  The  specific  goal  of  this  analysis  was  to  provide  data  to  quantify  and  describe 
the  effect  of  increased  bulk  and  decreased  mobility  resulting  from  the  wearing  of 
summer  and  winter  flight  clothing  and  equipment  in  a  typical  ejection  seat 
environment. 

5. The data derived from this evaluation can be used in the following
applications: (1) as correction constants to be applied to current computer based
simulation models which have as their goal the early (blueprint) detection of
inconsistencies between planned cockpit geometry and anthropometric characteristics
of the intended user population, (2) as a design aid to engineers tasked with
providing the anthropometric accommodation in aircrew stations specified by
military standards, and (3) as a reference aid to those organizations tasked with
developing aircrew clothing and equipment possessing the minimum bulk, weight,
and mobility restriction commensurate with the necessary protective characteristics.

DESCRIPTION OF TEST FACILITY

6. Comparative anthropometric measurements of subjects in unclad, summer
flight  gear  and  winter  flight  gear  configurations  were  made  using  a  Navy 
64A105H1-1  Integrated  Measuring  Anthropometric  Device  and  a  standard  medical 
weight  scale. 

7. The cockpit specific anthropometric range of motion measurements were made
in a Douglas ESCAPAC 1F-3 ejection seat and restraint system modified with
adjustable point-of-reference protractors positioned at range-of-motion joints (i.e.,
neck, clavicle, elbow, wrist, lumbar, hip, and ankle areas). The ejection seat
selected  was  typical  of  lap  belt  and  inertia-reel  torso  restraint  systems  found  in 
ejection-seat  equipped  tactical  aircraft. 

8.  This  evaluation  investigated  the  flight  clothing  and  equipment  effects  on 
volume and mobility for a sample of aircrewmen representative of the entire
spectrum of Naval aviator body sizes. The 1964 Anthropometry of Navy Aviators
Survey, which listed body size data for 96 measurements of 1,549 aviators, was
used  for  anthropometric  percentile-rank  criterion  of  the  measurements  evaluated 
except  for  buttock-leg  dimensions.  A  1976  data  sample  compiled  on  anthropometric 
variables  for  969  aviators  was  used  to  define  the  buttock-leg  percentile-rank 
criterion  for  this  evaluation. 

9. The anthropometric dimensions, joints, and respective range-of-motion measurements
included:

a.  Dimensions. 

(1)  Weight. 

(2)  Stature. 

(3)  Standing  waist  height. 

(4)  Functional  arm  reach. 

(5)  Shoulder-elbow  length. 

(6)  Forearm-hand  length. 

(7)  Hand  length. 

(8)  Standing  hip  breadth. 

(9)  Sitting  height. 

(10)  Bideltoid  diameter. 

(11) Buttock-knee length.

(12) Sitting hip breadth.

(13) Popliteal height.

(14)  Buttock-leg  length. 

(15)  Foot  length. 

b. Joints and respective ranges of motion.

(1)  Neck  -  head/look  angle. 

(a)  Elevation. 

(b)  Declination. 

(c) Azimuth right.

(d) Azimuth left.

(2) Clavicle/humeral - extended arm movement.

(a)  Elevation. 

(b)  Declination. 

(c) Azimuth right.

(d) Azimuth left.

(3) Elbow - lower arm movement (measured with upper arm extended
horizontally and vertically from clavicle joint).

(a)  Elevation. 

(b)  Declination. 

(c) Azimuth.

(4)  Wrist  -  extended  hand  movement. 

(a)  Elevation. 

(b)  Declination. 

(c) Azimuth right.

(d) Azimuth left.

(5) Lumbar - torso movement, sitting.

(a) Declination.

(b) Torsion right.

(c)  Torsion  left. 

(6)  Hip  -  upper  leg  movement,  sitting. 

(a)  Elevation. 

(b)  Azimuth  right. 

(c) Azimuth left.

(7)  Knee  -  tibial  movement. 

(a)  Elevation. 

(b)  Declination. 

(8)  Ankle  -  foot  movement. 

(a)  Elevation. 

(b)  Declination. 

(c)  Azimuth  right. 

(d)  Azimuth  left. 

10.  The  parameters  for  both  series  of  anthropometric  dimensions  and  angular 
range-of-motion  joints  were  selected  from  a  crew  station  assessment  of  reach 
computer based simulation model. Over 2,300 measurements were taken for this
evaluation. 

11.  The  scope  of  the  flight  clothing  and  equipment  evaluated  included  those 
current  inventory  items  typically  worn  by  those  Navy  crewmen  who  fly  tactical  and 
training  aircraft  equipped  with  ejection  seats. 

12.  With  the  exception  of  those  data  directly  affected  by  the  torso  harness  and 
ejection  seat  restraint  systems,  other  data  can  be  applicable  to  nonejection  seat 
aircraft. 

METHOD  OF  TESTS 

13. The subject crewmen were measured in three separate configurations:
(1)  unclad,  (2)  dressed  and  equipped  for  summer  flight,  and  (3)  dressed  and  equipped 
for  winter  flight.  Each  dimensional  and  angular  measurement  was  made  four  times 
and  averaged  to  reduce  measurement  error  variability.  The  quantification 
procedures  are  listed  below: 

a.  The  subject  was  weighed. 

b. Cockpit specific anthropometric measures were made using the Navy
64A105H1-1  Integrated  Anthropometric  Measuring  Device.  Data  were 
recorded  on  an  anthropometric  data  form. 


c. The subject was seated in the ejection seat. Specially mounted transparent
protractors were then adjusted horizontally or vertically with the protractor
center of radius point aligned with the estimated locus of the joint
center of mass. The protractor zero deg reference line was then adjusted
vertically and horizontally forward from the subject's respective joint. The
subject then moved his joint segment (e.g., arm around clavicle joint) to a
point of maximum possible elevation, declination, or azimuth. The
experimenter aligned an index marker line which originated in the
protractor center of radius with the estimated midline of the respective
segment and read the degrees of rotation from zero deg as indicated on the
protractor by the index marker line. The maximum angles of motion about
joints were recorded on a second anthropometric data form.

d.  Additionally,  while  secured  to  the  ejection  seat  lap  belt  and  inertia-reel 
torso  restraint  system,  each  subject's  reach  distance  was  measured  relative 
to  three  specified  "reach  zones."  Zone  1  defines  the  subject  relaxed  in  a 
locked  harness  reaching  to  controls  without  straining  against  the  harness. 
In  Zone  1,  the  lumbar,  thoracic,  interclavicular,  and  clavicular  segments  do 
not  move.  In  Zone  2,  the  subject  strains  against  the  locked  harness  to 
obtain  maximum  reach.  The  lumbar,  thoracic,  and  interclavicular  segments 
do  not  move  except  for  the  stretch  in  torso  restraint  system.  The  clavicular 
segment does move since it is not securely held by the torso harness and
restraint system. In Zone 3, the shoulder harness is unlocked and the
subject  is  free  to  lean  forward  or  to  the  side  to  obtain  maximum  reach 
within  the  limits  of  shoulder  harness  strap  length.  The  lumbar  and  thoracic 
segments  move  within  the  limits  of  shoulder  harness  strap  length.  The 
reach  distances  were  measured  from  the  thumb  and  forefinger  grasp  to  a 
point  at  the  intersection  of  the  seat  back  surface  and  top  surface  midpoint 
of  the  subject's  shoulder. 
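
The averaging step described at the start of this section can be sketched as follows: each measurement is taken four times and averaged, and the clothed mean is compared with the unclad mean. The function name and the sample readings are hypothetical and serve only to show the arithmetic; they are not data from this evaluation.

```python
from statistics import mean

def correction_factor(unclad_trials, clothed_trials):
    """Mean clothed measurement minus mean unclad measurement."""
    return mean(clothed_trials) - mean(unclad_trials)

# Hypothetical sitting-height readings in inches, four trials per configuration.
unclad = [35.9, 36.0, 36.1, 36.0]
summer = [38.1, 38.2, 38.3, 38.2]

print(round(correction_factor(unclad, summer), 1))
```

A positive result indicates added bulk; for a range-of-motion angle the same subtraction would normally yield a negative value, that is, reduced mobility.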

RESULTS AND DISCUSSION


14. The subjects used in this evaluation were seven males, carefully selected to
represent the range of anthropometric characteristics found in the Naval aviation
population.  Subjects  representative  of  5th,  25th,  50th,  75th,  and  99th  percentile 
population  members  relative  to  stature  and  weight  were  selected.  For  subjects  1 
through  5,  each  of  the  16  anthropometric  variables  was  screened  to  be  within  one 
standard  deviation  of  the  population  percentile  equivalent  being  represented. 

15.  The  primary  purpose  of  the  evaluation  was  to  quantify  the  added  bulk, 
displacement  of  posture,  and  restriction  of  mobility  which  results  from  the  average 
effects  of  flight  clothing  and  equipment.  Therefore,  population-wide  representative 
sampling  of  pertinent  anthropometric  parameters  was  employed.  The  data  are, 
therefore, presented as plus or minus correction factors relative to the dimensional
and angle of motion differences quantified between unclad and summer gear and
between unclad and winter gear configurations. The average increased bulk
anthropometric dimensional correction factor data are presented in appendix A.

16.  For  angular  quantification,  a  forward-facing  seated  posture  was  assumed  by 
the  subjects.  All  joint  measurements  were  made  on  the  right  side  of  the  body;  left 
side  mirror-image  reciprocals  were  assumed.  Vertical  measurements  were  made 
from a line extending 90 deg to the right of the joint at zero deg elevation. All
horizontal measurements were from a line extending forward of the joint at
zero deg azimuth. The angular quantifications of average decreased mobility
resulting from summer and winter flight gear with locked torso restraint systems
are presented in appendix B. Appendix C presents reach data as a function of reach
zone and flight gear worn.

17. A maximum effort redesign of the complete flight clothing and equipment
system  is  necessary  to  reduce  the  bulk  and  weight  effects  of  such  clothing  and 
equipment  on  mobility  within  an  aircrew  station. 


18. When designing crew station geometry and locating controls and displays,
designers  should  incorporate  the  maximum  available  data  describing  reduction  in 
anthropometric  mobility  and  increase  in  anthropometric  volume  resulting  from 
flight  clothing  and  equipment  worn  on  the  body. 

19. The following comments are relative to bulk and mobility restrictions per item
or  per  group  of  items  comprising  the  flight  clothing  and  equipment. 


a. Helmet (APH 6*3)/Oxygen Mask (A13-A) - Five and one-half lb (2.5 kg);
weight, bulk, and oxygen hose/regulator "drag" compromise vertical and
horizontal head motion and look angle. The anti-exposure suit hampers
horizontal  mobility  less  than  it  does  vertical  mobility. 

b. Flying coveralls (CS/FRP-1), gloves (GS/FRP-1), torso harness (MA-2) - Six
and six-tenths lb (3.0 kg); weight and bulk not oppressive. When secured to
lap belt and shoulder restraint, mobility is naturally restricted. However,
redesign  of  the  lap  belt  to  an  inertia  system  such  as  the  shoulder  restraints 
and  increasing  shoulder  inertia-reel  strap  length  would  ease  mobility  in 
Zone  3  conditions.  The  flight  gloves  were  the  least  bulky  and  least 
restrictive  item  of  wear. 

c.  Anti-G  coveralls  (MK-2A)  -  Two  and  two-tenths  lb  (1.0  kg);  slightly 
restrictive  due  to  necessary  tight  fit.  As  a  result  of  interviewing 
operational  pilots,  it  was  determined  that  this  item  was  generally  not 
accepted  to  wear  in  conjunction  with  CWU-33P  anti-exposure  suit. 

d.  Survival  vest  (SV-2A)  -  Two  and  four-tenths  lb  (1.1  kg);  weight  and  bulk 
interfere  with  torso  and  arm  movements. 

e.  Boots  (B  21408)  -  Four  and  five-tenths  lb  (2.0  kg);  slight  mobility  restriction 
due  to  weight  and  length  of  vertical  dimension. 

f.  Life  preserver  (LPA-2)  -  Four  and  five-tenths  lb  (2.0  kg);  displaces  posture 
slightly  due  to  packaging.  Occasional  interference  with  inertia-reel 
shoulder  straps. 

g.  Anti-exposure  suit  (CWU-33P)  -  Six  lb  (2.7  kg);  this  was  by  far  the  bulkiest, 
most  restrictive  item  of  equipment.  The  anti-exposure  suit  significantly 
reduced  angle  of  motion  in  the  arms,  legs,  and  torso.  The  bulk  was 
restrictive  not  only  about  the  shoulders,  elbows,  and  knees,  but  increased 
the  effective  retention  of  the  torso  system  regardless  of  harness  locked  or 
unlocked  condition.  Reach  to  cross-cockpit,  vertical,  and  side-console  areas 
was considerably hampered, if not prevented, by the anti-exposure suit.
Some  subjects  had  difficulty  reaching  the  overhead  face-curtain  ejection 
handle  as  a  result  of  the  anti-exposure  suit  bulk  and  mobility  restrictions. 

h. The total weight of either summer or winter gear was subjectively
identified as one of the more objectionable factors of the flight clothing and
equipment  by  each  of  the  subjects  as  well  as  numerous  aircrewmen 
interviewed  during  the  project. 

i.  All  aircrewmen  involved  in  the  project  expressed  the  need  for  an  all- 
encompassing  integrated  redesign  of  the  entire  package  of  personal  flight 
equipment  which  would  reduce  weight  and  increase  mobility. 

SUMMARY TABLE OF AVERAGE FLIGHT
CLOTHING/EQUIPMENT DIMENSIONAL CORRECTION FACTORS

                                 Mean Differences         Mean Differences
                                 Between Nude             Between Nude
                                 Dimensions and           Dimensions and
Anthropometric Measurements      Summer Flight Gear       Winter Flight Gear

 1. Weight                       +28.3 lb (+12.8 kg)      +32.0 lb (+14.5 kg)
 2. Stature                      +3.2 in. (+8.1 cm)       +3.2 in. (+8.1 cm)
 3. Waist height                 +1.2 in. (+3.1 cm)       +1.2 in. (+3.1 cm)
 4. Arm reach (1)                +.3 in. (+.8 cm)         +.5 in. (+1.3 cm)
 5. Shoulder-elbow length        +.1 in. (+.3 cm)         +.6 in. (+1.5 cm)
 6. Forearm-hand length          +.1 in. (+.3 cm)         +.3 in. (+.8 cm)
 7. Hand length                  0                        0
 8. Hip breadth, standing        +1.1 in. (+2.8 cm)       +1.5 in. (+3.8 cm)
 9. Sitting height (2)           +2.2 in. (+5.6 cm)       +2.5 in. (+6.2 cm)
10. Eye height, sitting          +.3 in. (+.8 cm)         +.5 in. (+1.3 cm)
11. Bideltoid diameter           +.2 in. (+.5 cm)         +1.8 in. (+4.6 cm)
12. Buttock-knee length          +.2 in. (+.5 cm)         +.4 in. (+1.0 cm)
13. Hip breadth, sitting         +.9 in. (+2.3 cm)        +1.8 in. (+4.6 cm)
14. Popliteal height, sitting    +.2 in. (+.5 cm)         -.1 in. (-.3 cm)
15. Buttock-leg length (3)       +1.4 in. (+3.6 cm)       +1.7 in. (+4.3 cm)
16. Foot length                  +1.4 in. (+3.6 cm)       +1.4 in. (+3.6 cm)

NOTES: (1) Clavicular joint, humeral, radial, hand finger-grip links.
       (2) Lumbar, thoracic, vertical neck, lower head, upper head links.
       (3) Femoral, tibial, foot links.
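
One intended use of these figures is as correction constants added to nude anthropometric dimensions before they are fed to a cockpit geometry model. The sketch below shows that step for three of the sixteen measurements, using the inch values tabulated above. The dictionary layout, the function name, and the sample nude dimensions are assumptions made for illustration only.

```python
# Summer and winter corrections in inches, copied from the table above
# for three of the sixteen measurements.
CORRECTIONS_IN = {
    "sitting_height": {"summer": 2.2, "winter": 2.5},
    "bideltoid_diameter": {"summer": 0.2, "winter": 1.8},
    "buttock_knee_length": {"summer": 0.2, "winter": 0.4},
}

def clothed_dimensions(nude_in, gear):
    """Add the tabulated correction for `gear` to each nude dimension (inches)."""
    return {
        name: round(value + CORRECTIONS_IN[name][gear], 1)
        for name, value in nude_in.items()
    }

# Hypothetical nude dimensions for one crewman, in inches.
nude = {"sitting_height": 36.0, "bideltoid_diameter": 19.0}
print(clothed_dimensions(nude, "winter"))
```

Because the table reports sample averages, a model using these constants is applying a mean offset rather than a value fitted to any individual.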

SUMMARY TABLE OF AVERAGE FLIGHT CLOTHING/EQUIPMENT
CORRECTION FACTORS FOR JOINT-MOTION REDUCTION (1)

                                          ANGULAR DIFFERENCE DATA

                            No Flight     Differences in    Differences in
                            Gear          Summer            Winter
Joint                       Average       Flight Gear       Flight Gear

Neck:   elevation            73
        declination          61
        azimuth-right        85
        azimuth-left         85

Arm:    elevation           105
        declination         152
        azimuth-right       132
        azimuth-left         55

Elbow:  elevation           116
        declination          72
        azimuth-left         63

Wrist:  elevation            61
        declination          75
        azimuth-right        44
        azimuth-left         26

Torso:  declination (2)      86
        torsion-right        45
        torsion-left         45

Leg:    elevation            46
(femur) azimuth-right         7
        azimuth-left         28

Ankle:  elevation            23
        declination          15
        azimuth-right        45
        azimuth-left         40

NOTES: (1) Measured in degrees. Corrections are from right arm and leg
           extremities; left side mirror image is assumed.
       (2) Average of lumbar and thoracic links, harness unlocked.

SUMMARY TABLE OF AVERAGE FLIGHT CLOTHING/EQUIPMENT
CORRECTION FACTORS REACH ZONE DATA FOR JOINT-MOTION REDUCTION

               Zone 1                Zone 2                Zone 3

Summer Gear    32.1 in. (81.5 cm)    36.8 in. (93.5 cm)    43.5 in. (110.5 cm)
Winter Gear    32.2 in. (81.8 cm)    35.3 in. (89.7 cm)    40.3 in. (102.4 cm)
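
A designer could use these averages to check which reach zone is needed to grasp a control at a given distance. The sketch below is one hedged way to do that. It assumes the control distance is measured from the same reference point as the reach data (the intersection of the seat back surface and the top of the shoulder), and the function name is hypothetical.

```python
# Average reach in inches by flight gear and reach zone, from the table above.
REACH_IN = {
    "summer": {1: 32.1, 2: 36.8, 3: 43.5},
    "winter": {1: 32.2, 2: 35.3, 3: 40.3},
}

def lowest_zone_reaching(distance_in, gear):
    """Return the least demanding zone whose average reach covers the distance.

    Returns None when the control lies beyond the Zone 3 average reach.
    """
    for zone in (1, 2, 3):
        if REACH_IN[gear][zone] >= distance_in:
            return zone
    return None

print(lowest_zone_reaching(36.0, "summer"), lowest_zone_reaching(36.0, "winter"))
```

A control 36 in. away falls in Zone 2 for summer gear but requires Zone 3 (unlocked harness) in winter gear, which is the kind of difference the table is meant to expose.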

ANALYSIS OF QUESTIONNAIRE DATA TO IDENTIFY
"ACE" ATTACK HELICOPTER PILOTS


Robert  F.  Eastman 
Harie  Leger 
Brian  D.  Shipley,  Jr. 


US  Army  Research  Institute  Field  Unit 
Fort  Rucker,  Alabama  36362 


At  the  request  of  US  Army  Training  and  Doctrine  Command,  the  Army 
Research Institute Fort Rucker Field Unit is engaged in a program of
research to determine the characteristics and traits required and/or
desirable for combat effective attack helicopter pilots. The acronym
"ACE"  refers  to  AH-1  Combat  Effective  pilots  who  are  defined  by  combat 
performance  criteria  other  than  the  traditional  criterion  of  five  air- 
to-air  kills. 

The  construction  of  profiles  derived  from  the  study  of  combat 
proven  attack  pilots  and  the  development  of  methods  to  select  potential 
candidates for Attack Helicopter (AH-1) transition training are the
objectives  of  the  "ACE"  program.  The  accomplishment  of  this  effort 
involves  the  following  three  interrelated  sub  tasks  which  are  being 
conducted  concurrently. 

Sub  Task  1.  Survey  of  Combat  Proven  Attack  Helicopter  Pilots  to 
Develop  Profiles  of  Potential  "ACE"  Pilots  and  Predictive  Instruments 
to  Select  Them. 

Sub  Task  2.  Development  of  Rating  Forms  for  Both  School  and  Unit 
Level Application to Assess Desirable Traits and Characteristics and
Identify  Potential  Attack  Pilots  from  Among  Candidates  for  Training. 

Sub  Task  3.  Evaluation  and  Assessment  of  AH-1  Trainees  Against 
Characteristics  and  Traits  Determined  in  Research  Tasks  1  and  2. 

This paper describes the operational procedures and results of the
survey of combat proven attack helicopter pilots (Sub Task 1). However,
the results and conclusions presented should be interpreted heuristically,
as part of a larger research project in progress, which will be finished
with the completion of follow-up efforts.

There  are  a  number  of  studies  available  to  indicate  that  a  measurable 
relationship  exists  between  attitude  variables  and  effectiveness  in  aerial 
combat  (e.g. ,  Knoell,  1953;  Trites,  et  al,  1953;  Strawbridge  and  Kahn,  195\‘ 
Torrance, et al, 1957; Youngling, et al, 1977). Likewise, there is strong
evidence to support the notion that background/biographical material is
related to the combat success of aviators (Bond and Burchell, 1944; Torrance,
(n.d.); Torrance, et al, 1957; Youngling, et al, 1977).

None  of  the  studies  available  in  the  literature  has  dealt  with  the 
relationship  between  these  variables  and  the  combat  effectiveness  of  attack 
helicopter  pilots.  However,  the  working  assumption  was  made  that  attitude 
and  background  variables  would  also  apply  to  the  problem  of  identifying  the 
population  of  combat  effective  attack  helicopter  pilots. 

The method adopted for assessing the potential value of these
variables was to mail questionnaires to samples of "ACE" pilots and controls
and then analyze their responses to determine if these variables provide a
means for discriminating between individuals in the two groups. The
following  sections  will  describe  these  procedures  in  detail. 


METHOD 


SAMPLES. 

The  Army's  Military  Personnel  Center  (MILPERCEN)  provided  the  names  and 
unit  addresses  of  the  following  two  samples  of  aviators  from  Officer  Master 
File  Records. 

ACEs: The "ACE" sample (actually a sub-population) consisted of all
commissioned  and  warrant  officer  aviators  meeting  the  following  criteria: 

1.  Recipient  of  the  Silver  Star  or  a  higher  award  for  valor. 

2.  Served  in  Vietnam  during  the  period  1965-1972. 

3. Attack Helicopter rated.

A  total  of  280  officers  who  met  these  criteria  were  included  in  the 
"ACE"  group.  Only  aviators  on  active  duty  were  included. 

Controls:  The  control  group  consisted  of  a  "random"  sample  of 
commissioned  and  warrant  officer  aviators  meeting  the  following  criteria: 

1.  Had  not  received  the  Distinguished  Flying  Cross  or  a  higher  award 
for  valor. 

2.  Served  in  Vietnam  during  the  period  1965-1972. 

3. Not gunship qualified.

A  total  of  385  officers  who  met  these  criteria  were  included  in  the 
sample.  Only  aviators  who  were  still  on  active  duty  were  included  in  the 
study. 


SURVEY  MATERIALS. 


Addressees in the ACE and Control groups all received a set of questionnaires
which included:

a. Letter of Instruction

b. Military Background Form (MBF) 15 Items

c. Background and Activities Inventory (BAI) 30 Items

d. Aviator Attitude Questionnaire (AA) 35 Items

e. Self-Description Form (SD) 25 Items

The  letter  of  instruction  explained  the  purpose  of  the  survey  in  very 
general terms, assured the respondents that they would remain anonymous,
and requested that the questionnaires be completed and returned to ARI at
Fort Rucker. The Military Background Form was designed to get information
regarding such variables as the aircraft qualifications, flying experience,
combat experience and decorations for valor of the respondents. In addition,
it was intended to provide validation of the samples provided by MILPERCEN.

The Background and Activities Inventory contains items selected from
existing Army inventories (e.g., the Biographical Inventory of the Flight
Aptitude Selection Tests (FAST) (Kaplan, 1965), Interest-Opinion
Questionnaire) on the basis of preliminary "hypotheses" obtained from the
research literature, structured interviews of Attack pilots, and preliminary
item analysis of the responses of more than 50 Attack pilots as compared with
student pilots.

The Aviator Attitude Questionnaire was developed from the content
analysis of the comments of 10 Attack pilots during unstructured
interviews at Ft Rucker. All items in the questionnaire were mentioned by at
least two of the pilots interviewed.

The items in the Self-Description Form were drawn primarily from the
Self-Description Form of the FAST battery. In addition, the themes for
several items were obtained from standardized personality tests (Butcher,
1969).

The Military Background Form and sample items from each of the
questionnaires are included in Appendix A.

PROCEDURE. 

The packages of survey questionnaires were mailed to ACEs and Controls
during a four month period from Dec 76 to March 77. Addressees in both
groups received an identical set of questionnaires, except that each
questionnaire in a set was stamped with a 3 digit identification number.
The two samples were assigned different ID number sequences. Therefore,
the set of questionnaires completed by a single individual all bore the
same three digit number. The range within which that number fell was
determined by the group the individual belonged in (e.g., 001-449 Controls;
501-999 ACE).
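
The ID scheme described above can be sketched as a simple lookup. This is only an illustration: the ranges are the ones the text offers as an example, and the function name `group_for_id` is hypothetical.

```python
def group_for_id(id_number: int) -> str:
    """Map a three-digit questionnaire ID to the sample it was assigned to.

    The ranges follow the example given in the text (001-449 Controls,
    501-999 ACE); the actual ranges used in the study may have differed.
    """
    if 1 <= id_number <= 449:
        return "Control"
    if 501 <= id_number <= 999:
        return "ACE"
    raise ValueError(f"ID {id_number} falls outside both assigned ranges")


print(group_for_id(37), group_for_id(612))
```

Because the group is recoverable from the number alone, returned forms can be sorted into samples without asking respondents to identify themselves.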

As questionnaires were returned, data were coded for both groups. Survey
returns received after mid June were not included in the analyses reported
below. Seventy-four percent (74%) of the ACE sample (208) and 58% of the
Controls (224) returned completed survey questionnaires. When questionnaires
which were returned as undeliverable are taken into account, the return
rate was 79% for the "ACE" sample and 65% for the Controls.
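
As a rough check on the unadjusted figures, the return rates can be recomputed from the sample sizes and return counts reported above. The sketch below is illustrative only. The text does not report the number of undeliverable questionnaires, so the `undeliverable` argument is a hypothetical parameter left at zero.

```python
def return_rate(returned: int, mailed: int, undeliverable: int = 0) -> float:
    """Share of questionnaires returned among those assumed to be delivered."""
    delivered = mailed - undeliverable
    if delivered <= 0:
        raise ValueError("no questionnaires were delivered")
    return returned / delivered


# Counts reported in the text: questionnaires mailed and returned per group.
ace_rate = return_rate(returned=208, mailed=280)
control_rate = return_rate(returned=224, mailed=385)
print(f"ACE: {ace_rate:.0%}, Controls: {control_rate:.0%}")
```

Supplying a nonzero `undeliverable` count shrinks the denominator, which is how the adjusted return rates in the text exceed the unadjusted ones.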


RESULTS 

Analysis of Questionnaire Items.*

Analysis of categorical data of the MBF and the personal data section
of the BAI reveals several significant differences between the ACEs and
Controls; the responses of the ACE pilots indicate that they differ
significantly from Control pilots by being (1) of higher rank (p = .0002),
(2) longer in service, (3) more likely to be in the combat arms (p < .0001),
(4) more combat experienced (p < .0001), (5) more likely to report having
higher efficiency ratings (p = .033), and (6) lower in civil education level
(p = .027) (see Table 1 below). Although the ACE group were of higher rank
(see Table 1) and had more median years of service than the Non ACE Controls
(14 yrs vs 12 yrs, p < .01), they did not differ significantly from the
Controls in median age (ACE = 35 yrs, Controls = 34 yrs). The combat arms
affiliation and combat experience variables are predictable, and do confirm
the effectiveness of the sampling criteria in identifying combat experienced
attack pilots. The principal source of the difference between ACE and Control
pilots in civil education is the smaller percentage of ACEs (35%)
than Controls (51%) who have a college degree. The questionnaire items for
which the responses of the ACEs and Controls differ significantly are
presented in Table 1. The BAI items, other than civil education, which
were significant indicate that, as a group, the ACEs differ from the Controls
by reporting that they engage and excel more often in sports and activities
of a physical or active nature.

The results also indicate that differences in interest items which are
not of a physical or active nature do not discriminate between the combat
effectives and Controls. None of the items asking about the frequency of
engaging in activities such as reading, hobbies, special interests, etc.,
produced any significant differences between the two groups. The Aviator
Attitude items that were the most discriminating, as noted above, were
derived from the analysis of comments by combat experienced attack pilots
obtained during interviews. Because of this, much of their effectiveness
may be limited to selection of experienced aviators for advanced training.
However, some of the more significant items may be of a general nature
(e.g., "I enjoy the power of weapons"). The differences between the ACEs
and Controls on these items are consistent with stereotyped notions of how
the aggressive combat performer "should" differ from his less aggressive
peers.

* It was possible to include data for 208 ACEs and 222 Non ACE Controls
in the analysis. However, because of missing data the Ns for a
specific item were often slightly smaller.

Discriminant  Analysis. 

Discriminant analysis was applied to the questionnaire data to identify
the combination of items that best differentiates between the "ACE" and
Control helicopter pilots. The stepwise discriminant analysis procedure
was used (MINRESID method) to eliminate less useful items from the
discriminant functions (Nie, et al., 1975). An arbitrary maximum of
15 steps was specified for the analysis. This was done to obtain a
reduced set of items for inclusion in a single new instrument along with
additional "filler" and untried items. The summary table for the analysis
is shown in Table 2. As indicated in the table, by the 15th step the
contribution of any additional variables to the discrimination was
approaching insignificance (change in Rao's V, p = .04). The items selected
for inclusion in the discriminant function were for the most part included in
the set of statistically significant items in Table 1. In addition, several
apparent suppressor items were selected. The complete set of 15 items is
presented in Appendix A. The set of original variables included in the
discriminant function was then used to classify members of both the original
samples to see how many would be correctly classified into their actual group.
By this process the probability of belonging to one or the other group is
calculated by a separate linear combination of the variables for each
individual case; assignment is then made to the group with the highest
probability. As is shown in Table 3, this procedure resulted in an 80%
probability of correct classification of the ACE and Control pilots in their
actual groups.


TABLE 2

SUMMARY TABLE OF DISCRIMINANT ANALYSIS OF QUESTIONNAIRE DATA


        Item         F to Enter    Change in
Step    Entered(1)   or Remove     Rao's V      Significance

  1     AA 04          94.26         94.23        < .001
  2     AA 07          14.22         17.98        < .001
  3     SD 13           9.76         12.86        < .001
  4     BAI 16         10.97         14.89        < .001
  5     AA 34           8.06         11.31        < .001
  6     AA 01           7.41         10.66        < .001
  7     SD 02           6.21          9.13        < .003
  8     BAI 14          6.43          9.65        < .002
  9     BAI 04          4.80          7.36        < .007
 10     AA 16           6.32          9.84        < .002
 11     AA 02           4.87          7.75        < .005
 12     BAI 27          3.82          6.18        < .013
 13     SD 10           3.13          5.12        < .024
 14     SD 08           2.77          4.60        < .032
 15     AA 24           2.51          4.21        < .040

(1) See Appendix A for content of items.


TABLE 3  CLASSIFICATION OF GROUP MEMBERSHIP


                        Predicted Group

Actual Group            ACE         Control

ACE                     84.7%       15.3%

CONTROLS                24.5%       75.5%


A total of 79.9% of cases correctly classified


This relatively high percentage of correct classification is a necessary
prerequisite for using the discriminant function to predict or classify
a new or unknown sample of aviators into the group their responses
indicate they are most suited for.
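
The classification step described above (one linear combination of the items per group, with each case assigned to the group with the highest score) can be sketched as follows. This is a minimal illustration and not the SPSS procedure used in the study. It assumes equal prior probabilities and a pooled within-group covariance matrix, and it runs on synthetic data. The function names and the three made-up "item" columns are hypothetical.

```python
import numpy as np


def fit_classification_functions(X, y):
    """Fit one linear classification function per group (equal priors assumed)."""
    groups = np.unique(y)
    means = np.array([X[y == g].mean(axis=0) for g in groups])
    # Pooled within-group covariance matrix.
    pooled = sum((X[y == g] - m).T @ (X[y == g] - m) for g, m in zip(groups, means))
    pooled = pooled / (len(X) - len(groups))
    coefs = np.linalg.solve(pooled, means.T).T     # one row of weights per group
    consts = -0.5 * np.sum(coefs * means, axis=1)  # one constant per group
    return groups, coefs, consts


def classify(X, groups, coefs, consts):
    """Assign each case to the group with the highest classification score."""
    scores = X @ coefs.T + consts
    return groups[np.argmax(scores, axis=1)]


# Synthetic stand-in data: 3 hypothetical item scores for 200 cases per group.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, size=(200, 3)),
               rng.normal(0.0, 1.0, size=(200, 3))])
y = np.array(["ACE"] * 200 + ["Control"] * 200)

groups, coefs, consts = fit_classification_functions(X, y)
predicted = classify(X, groups, coefs, consts)

# Row-percentage hit table (rows: actual group, columns: predicted group).
for actual in groups:
    row = [(predicted[y == actual] == g).mean() for g in groups]
    print(actual, [f"{share:.1%}" for share in row])
hit_rate = (predicted == y).mean()
print(f"overall correct: {hit_rate:.1%}")
```

Because the same cases are used to fit and to classify, a hit rate obtained this way tends to be optimistic; classifying a fresh sample, as the follow-up research proposes, gives a fairer estimate.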


DISCUSSION  AND  CONCLUSIONS 


The analysis of the questionnaire data indicates that personality,
attitude and background measures can be used to discriminate between
combat effective attack helicopter pilots and control pilots. It is
apparent that further research to develop instruments to select helicopter
pilots for attack training is worth pursuing. It is anticipated that, in
addition to providing training selection/classification instruments, the
content of the items can provide the basis for developing representative
profiles to assist unit commanders in assigning aviators for training as
attack pilots.

The BAI items, for example, suggest that ACE pilots are more likely
to report that they participate, and are proficient, in physical/aggressive
activities. Target shooting, combatives (boxing, wrestling) and strength/
endurance/fitness type activities are all more typical of the combat
effective. This suggests that the ACE pilots are individuals who have
an image of themselves as highly physically active and aggressive, and are
confident in their ability to perform activities which conform to this
image (whether objectively justified or not).

The SD items for which there was a significant difference between ACE
and Control pilots indicate that the combat effectives are more likely to
be active, argumentative, patriotic individuals who like taking risks, but
want to feel they are in control of their actions. They are also more
quick to anger and be vindictive than the Non ACE Controls.

The AA items, as mentioned above, are aimed at tapping the attitudes
of experienced Army aviators and to that extent are limited in reflecting
more general attitudes. However, the following generalizations are possible:
The "ACE" pilots as a group are more likely to (1) express a desire for
combat duties, (2) enjoy the power of weapons, (3) feel that their job is
important or critical, (4) feel that aggressiveness is important for combat,
and (5) enjoy heated arguments.

The adequacy of the discriminant function in classifying the original
set of cases was relatively high (80%). This indicates that the
discriminant function derived from the analysis of ACE and Non ACE Control
pilots can be used to classify aviators of unknown group membership. This
will be accomplished by calculating two classification scores, i.e., ACE and
Not ACE, for each aviator and classifying him into the group with the highest
score. In this way questionnaires can be used in conjunction with other
information and data (e.g., ratings) to assist commanders in assigning
aviators to types of advanced training (e.g., Attack AH-1 vs Cargo/lift
CH-47).

Research is currently underway to determine the predictive validity of
a rating form for AH-1 pilot candidates (Eastman and McMullen, 1976). As
an indirect result of this research, the training performance grades of a
large sample of AH-1 pilots will be available. Follow-up research is planned
to try to classify high scoring AH-1 graduates and a group of non AH-1 rated
pilots using the classification functions developed from the discriminant
analysis of these surveys. In addition, new items will be added to assess
areas not included in the present questionnaires. These will include the
following: (1) questions about the aviator's childhood experiences regarding
risk taking, aggressive behavior and troublemaking behavior, (2) measures of
confidence in physical prowess and resistance to stress, and (3) a question
asking whether he "sought out" or "drifted into" a military career.

The research objectives of this program were to develop instruments
for selecting aviators for advanced training. However, a development in
the Army's Initial Entry Rotary Wing (IERW) program has been the initiation
(in June 76) of a dual track program in which 25% of the students will be
trained in the OH-58 Scout for the final (tactics) phase of IERW; a triple
track program in which another sizable percentage of IERW students will be
tracked into an AH-1 attack helicopter transition and tactical training as
an attack pilot has been scheduled for 1978. Plans for eventual
establishment of a quad track for cargo/lift pilots have also been discussed.
These developments will have the following effects on a differential selection
program for Army aviators: (1) emphasis must shift to selection of flight
training students rather than experienced aviators, (2) a multiple
classification system based on the results of discriminant analysis of
effective aviator specialists must be developed. This means that questionnaire
items to be effective with IERW students will have to be general in nature,
i.e., they cannot be tailored to the opinions and attitudes of experienced
Army aviators as are many of the Aviator Attitude items. The multiplication
of tracks will probably result in the identification of a second
discriminant function reflecting a set of technical specialization items to
supplement the combat oriented discriminant function identifiable in this
research. It is anticipated that the centroids of the aviation specialities
which would result from a multi-track system would be meaningfully
represented in such a discriminant space.


SUMMARY 


The responses of 208 combat effective attack helicopter pilots (ACEs)
and 222 control pilots to four questionnaires, (1) Military Background
Form, (2) Background and Activities Inventory, (3) Aviator Attitude
Questionnaire and (4) Self Description Form, were analyzed. Items on which
the two groups differed significantly were identified. It is anticipated
that the items will provide the content for the construction of profiles of
individuals who are suited for training as attack pilots. Discriminant
analysis of the data indicates that a single, shorter questionnaire can be
developed to classify pilots as potential AH-1 combat effectives (ACEs) or
non-ACEs.




REFERENCES


Bond (Major) and Burchell (Major). A study of 100 successful airmen with
particular respect to their motivation and resistance to combat stress.
Report of Project #22, submitted to: Surgeon, HQ, Eighth Air Force,
December 1944.

Butcher, J. N. (ed.) MMPI: Research Developments and Clinical Applications.
New York: McGraw Hill, 1969.

Eastman, R. F. and McMullen, R. L. Reliability of associate ratings of
performance potential by Army aviators. Presented at the 18th
Conference of the Military Testing Association, 18-22 October, 1976
(available as ARI Research Memorandum 76-28, Nov 1976).

Kaplan, Harry. Prediction of success in Army aviation training. ARI
Technical Research Report 1142, June 1965.

Knoell, Dorothy M. Relationships between attitudes of bomber crews in
training and their attitudes and performance in combat (AFPTRC-TN-56-49).
Lackland Air Force Base, Texas: Air Force Personnel and
Training Research Center, April 1956.

Nie, N. H., Hull, C. H., Jenkins, J. G., Steinbrenner, K. and Bent, D. H.
SPSS Statistical Package for the Social Sciences, 2nd Ed., New York:
McGraw Hill, 1975.

Strawbridge and Kahn. Fighter pilot performance in Korea. Chicago,
Illinois Institute for Air Weapons Research, November 1955 (IAWR
Report #55-10).

Torrance, E. P. The development of a preliminary life experience
inventory for the study of fighter effectiveness. Lackland Air Force Base,
Texas.

Torrance, E. P., Rush, C. H., Kohn, H. B., Doughty, J. M. Factors in
fighter interceptor pilot combat effectiveness (AFPTRC-TR-57-11).
1957, NTIS No. AD-146~Zo7.

Trites, D. K., Holtzman, W. H., Templeton, R. C., Sells, S. B.
Psychiatric Screening of Flying Personnel: Research on the SAM Sentence
Completion Test. (AF Project No. 21-0202-0007, #3), USAF School of
Aviation Medicine, Randolph Field, Texas, July 1953.

Youngling, E. W., Levine, S. H., Mocharnuk, J. B. and Weston, L. M.
Feasibility study to predict combat effectiveness for selected
military roles: fighter pilot effectiveness. MDC E1634,
29 April 1977.




APPENDIX A


ITEMS SELECTED FOR DISCRIMINANT FUNCTION


ITEM ID   (Check)                      ITEM CONTENT

AA 04     T = True                     If I have to be in a combat assignment
          F = False                    I want to be doing the shooting.

AA 07     T = True                     In case of another war I want a combat
          F = False                    aviation assignment.

SD 13     Y = Describes you            My behavior is largely controlled by the
          N = Does not describe you    customs of my society.

BAI 16    How often have you           Sport parachuting or sky diving
          participated?
          A.  Never
          B.  Once or twice
          C.  A number of times
          D.  Frequently

AA 34     T = True                     Exceptionally good eyes and hands are more
          F = False                    important than aggressiveness to a
                                       gunship pilot.

AA 01     T = True                     Some people are naturally adapted to
          F = False                    combat.

SD 02     Y = Describes you            I don't like to argue.
          N = Does not describe you

BAI 14    How often have you           Surfboard riding
          participated?
          A.  Never
          B.  Once or twice
          C.  A number of times
          D.  Frequently

BAI 04    How far did you go in school before you came into the Army (or
          other branch of armed forces) on extended active duty?
          A.  Less than high school
          B.  High school graduate
          C.  High school equivalent (GED or other equivalent)
          D.  College, less than two years
          E.  College, two years or more
          F.  College degree
          G.  Graduate work, no degree
          H.  Graduate or professional degree

AA 16     T = True                     The best scout and gunship pilots have
          F = False                    strong suicidal tendencies.

AA 02     T = True                     A guy may be a turkey when it comes to
          F = False                    aircraft control, but if he's an
                                       aggressive competitor he'll be a good
                                       combat pilot.

BAI 27    How well do you              Weight lifting or strength exercises
          perform?
          A.  Outstanding
          B.  Well
          C.  Adequately
          D.  Poorly
          E.  Do not engage actively

SD 10     Y = Describes you            I sometimes tease animals.
          N = Does not describe you

SD 08     Y = Describes you            I don't like doing things on the spur
          N = Does not describe you    of the moment.

AA 24     T = True                     I'm among the best but the Army isn't
          F = False                    giving me any credit for it.


Pilot Selection Research in the Air Force

David R. Hunter
Personnel Research Division
Air Force Human Resources Laboratory
Brooks Air Force Base, Texas

As part of the mission of the Personnel Research Division of the Air
Force Human Resources Laboratory, continuing studies have been conducted
to investigate ways to improve the selection procedures for admission to
Undergraduate Pilot Training (UPT). This research has included attempts
to improve the existing paper-and-pencil selection measures,
investigations into the use of new solid-state psychomotor apparatus tests,
and evaluations of learning ability through the use of a light-plane
simulator. This report will outline the research that has been performed by
the Personnel Research Division in these areas and will indicate some of
the tasks that remain to be addressed.

Since the early 1950's, when psychomotor testing was discontinued,
the principal selection instrument for pilot training has been the Pilot
Composite of the Air Force Officer Qualifying Test (AFOQT). This
composite, in the previous (Form M) version of the AFOQT, consisted of the
seven subscales shown in Table 1. Also shown in Table 1 are the eight
subscales which comprise the pilot composite in the version (Form N) of
the AFOQT which will shortly become operational.

As can be seen from comparing these two lists, there is a
considerable difference between the two forms. This modification arose out
of a search for means to improve the predictive validity of the Pilot
Composite through the use of different item types.

An experimental reference battery consisting of the 21 scales listed
in Table 2 was administered to a sample of officers and officer trainees
slated to attend UPT and validated against their performance. The
correlations for each of these scales with a dichotomous Pass/Fail
criterion and a dichotomous Pass/Flying Training Deficiency (FTD)
elimination criterion are also presented in Table 2. Based upon the
results of this study, five new scales were selected for the Form N AFOQT.

Additionally, as a result of research performed by Guinn, Vitola,
and Leisey (1976), a Biographical and Attitude Scale was included. This
scale was developed as a result of research using the Strong Vocational
Interest Blank (SVIB) and the Officer Background and Attitude Survey.

The correlations between measures developed by Guinn et al. from these
two instruments and the two training criteria are shown in Table 3. In
contrast to most of the measures developed for screening personnel
entering UPT, these scales seem to have greater validity for prediction
of overall attrition than for Flying Training Deficiency elimination,
thus possibly indicating that they do not tap the abilities related to
flying skill but rather the attitudes and habits that allow one to
succeed.

Table 4 presents the multiple correlations between the paper-and-
pencil measures and the Pass/Fail UPT criterion. As can be seen from
this table, the multiple correlations are typically of a rather low order
and, in most cases, did not achieve statistical significance. Only the
SVIB consistently makes a significant contribution to prediction of the
criterion; however, the multiple correlation reported for the AFOQT is
seriously attenuated by restriction of range due to preselection using
this test.

To examine the relative contributions of these measures, the null
hypotheses listed at the bottom of Table 4 were tested with the noted
results.

In general, it seems that it would be possible to obtain
significant increases in the predictive validity of the AFOQT through the
addition of a scale measuring attitudes and interests, and this has been
done for the latest revision of the AFOQT.

The second area of pilot selection research has been concerned with
the measurement of psychomotor abilities and their relation to success
in UPT. Sanders, Valentine, and McGrevy (1971) and McGrevy and Valentine
(1974) have reported on the development and validation of two aircrew
psychomotor tests which have shown promise as possible instruments for
pilot selection.

The first of these tests, Two-Hand Coordination, requires that the
subject track a moving target with a small X shaped cursor, using two
hand joysticks. The right-hand joystick controls the movement of the
cursor in the right-left (X) coordinate, while the left-hand joystick
controls the movement of the cursor in the up-down (Y) coordinate.

Figure 1 shows the display used in this test and in the second test,
Complex Coordination.

The Complex Coordination test involves a compensatory tracking task
in which the subject controls the X and Y displacements of a cursor with
a single, large floor-mounted joystick. The subject's task is to keep
the cursor as close as possible to the intersection of a vertical and
horizontal line of dots. At the same time, he must use a rudder bar to
keep a short bar of light aligned with the vertical row of dots.

Scores for both of these tests consist of the summed absolute
displacements (in CRT units, approximately .01 inch) from the target point
to the cursor in the X and Y axes and the right-left axis (Z-axis) for
the rudder bar. These displacements are summed over the five 1-minute
periods of the tests for the X, Y, and Z axes separately.
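
The scoring rule can be sketched as follows. This is an illustration only: the source does not give the sampling rate or data layout of the test apparatus, so `samples_per_minute` and the array shape are assumptions, and `tracking_scores` is a hypothetical name.

```python
import numpy as np


def tracking_scores(displacements, samples_per_minute):
    """Sum absolute displacements per 1-minute period, separately for each axis.

    displacements: array of shape (n_samples, n_axes) holding signed
    cursor-minus-target offsets in CRT units (columns X, Y, Z).
    Returns an array of shape (n_minutes, n_axes).
    """
    d = np.abs(np.asarray(displacements, dtype=float))
    n_minutes = d.shape[0] // samples_per_minute
    d = d[: n_minutes * samples_per_minute]  # drop any incomplete final minute
    return d.reshape(n_minutes, samples_per_minute, -1).sum(axis=1)


# Toy record: 5 minutes at a hypothetical 4 samples per minute, constant error.
record = np.tile([[1.0, -2.0, 0.5]], (20, 1))
scores = tracking_scores(record, samples_per_minute=4)
print(scores)                     # one row per minute, columns X, Y, Z
print(scores[3:5].sum(axis=0))    # minutes four and five combined
```

Lower scores indicate better tracking, since each score is an accumulated error.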

Table 5 presents the correlations of these measures with UPT
criteria for two independent samples. The first sample (from an
unpublished study by McGrevy & Valentine, 1975) consisted primarily of
officer trainees slated to attend UPT. The correlations shown are those
between scores from the fourth and fifth minutes of the two tests for the
X, Y, and Z axes and two criteria: UPT Pass/Fail, and Flying Training
Deficiency Elimination versus any other disposition.

The second sample consisted principally of officers about to attend
the Flight Screening Program (FSP) at Hondo, Texas. The FSP program is
the first phase of UPT and consists of about 15 hours of instruction in
a T-41 aircraft.

Correlations are reported here for the arithmetic sum of the scores
from minutes four and five for the two tests. As can be seen from this
table, these two tests correlate significantly and consistently with
performance in UPT.

As a result of the initial studies conducted using these two tests,
which used a small minicomputer for the generation and scoring of the
tests, two new portable test devices were obtained. One of these devices
is shown in Figure 2. These devices are entirely self-contained and use
solid state electronic components to increase reliability and decrease
problems of calibration which contributed to the discontinuance of
psychomotor testing for the selection of pilot trainees in the early
1950's. A continuing research program is under way to assess the
validity and reliability of these devices, with the expectation that they
may be included in the screening process for the selection of pilot
trainees.

The third major area of pilot selection research has focused on the
evaluation of learning ability using an experimental task highly similar
to that involved in pilot training. An Automated Pilot Aptitude
Measurement System (GAT) was devised by Long and Varney (1975), which
utilized two light-plane simulators (Link-Singer General Aviation Trainers)
interfaced to a minicomputer to collect performance data on a number of
flight parameters during the course of a 5-hour syllabus of instruction on
how to fly the simulators.

Two studies were conducted using this system. The first study was
reported by Long and Varney (1975), and data from that study have been
reanalyzed so as to ensure comparability with the data analysis performed
on the second study. Because of the large number of variables (190)
collected on each subject by the system, some method of data reduction
was considered desirable. The Long and Varney study used a factor
analytic approach to reduce the number of variables; therefore, that
approach was replicated in the reanalysis of the data.

A principal components factor analysis, followed by Varimax rotation,
was performed on the 190 variables obtained from the system. Figure 3
shows the eigenvalues for the first ten unrotated factors. Figure 4
shows the proportion of variance for the first ten rotated factors. From
an examination of these two figures, it would seem that further
examination should be limited to the first six rotated factors. Based upon
their loadings with the raw variables, these first six factors may be
identified as shown in Table 6.
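
The link between eigenvalues and proportion of variance can be sketched
as follows. In a principal components analysis of a correlation matrix,
each standardized variable contributes one unit of variance, so with 190
variables the total variance is 190 and each unrotated component accounts
for its eigenvalue divided by 190. The function name and the eigenvalues
below are hypothetical and chosen only for illustration; they are not the
values plotted in the figures.

```python
def proportion_of_variance(eigenvalues, n_variables):
    """Return (proportion, cumulative proportion) for each component.

    Assumes the eigenvalues come from a correlation matrix, so the
    total variance equals the number of variables.
    """
    results = []
    cumulative = 0.0
    for value in eigenvalues:
        share = value / n_variables
        cumulative += share
        results.append((share, cumulative))
    return results


# Hypothetical eigenvalues for illustration only (not taken from the paper).
example_eigenvalues = [38.0, 19.0, 9.5, 7.6, 5.7, 3.8]

for rank, (share, cumulative) in enumerate(
    proportion_of_variance(example_eigenvalues, 190), start=1
):
    print(f"Factor {rank}: {share:.3f} of variance, {cumulative:.3f} cumulative")
```

A cutoff such as "the first six factors" is then a judgment made by looking
for the point where additional factors add little to the cumulative total.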

The correlations of these six factors with the two UPT criteria are
shown in Tables 7 and 8 for Study I and Study II, respectively. There is
considerable variation in the validities from one study to the next, and
this can be attributed, at least in part, to the instability of factors
determined from 190 variables while using only 140 subjects.

Because of this instability, it was considered desirable to develop
a simpler, more stable data reduction procedure. The simplest and most
obvious procedure was, of course, to form sums or averages of the
sets of the 190 variables that were in the same metric; that is, to
take the average of all deviations from command heading, all
deviations from command altitude, and so on. Tables 9 and 10 show the
correlations between these simple GAT scores and the two training
criteria for the two studies.
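
The averaging procedure described above can be sketched in a few lines.
This is a minimal illustration under stated assumptions: the variable
names, the metric labels, and the mapping from variable to metric are all
hypothetical, since the paper does not list the 190 variables.

```python
from statistics import mean


def simple_scores(record, metric_of):
    """Average one subject's variables within each metric.

    record:    dict of variable name -> recorded deviation
    metric_of: dict of variable name -> metric label (hypothetical grouping)
    """
    grouped = {}
    for name, value in record.items():
        grouped.setdefault(metric_of[name], []).append(value)
    return {metric: mean(values) for metric, values in grouped.items()}


# Hypothetical data for one subject: two flight segments, two metrics.
subject = {
    "heading_seg1": 4.0,
    "heading_seg2": 6.0,
    "altitude_seg1": 30.0,
    "altitude_seg2": 50.0,
}
metrics = {
    "heading_seg1": "heading",
    "heading_seg2": "heading",
    "altitude_seg1": "altitude",
    "altitude_seg2": "altitude",
}

print(simple_scores(subject, metrics))
```

Because each score is a plain average with fixed unit weights, nothing is
estimated from the sample, which is the reason such scores would be
expected to hold up better from one study to the next than factor scores.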

The correlations here are as high as, or higher than, the ones reported
for the GAT factors, while retaining their validity across the two studies.
The multiple correlations of these five simple scores with the two
criteria are given in Table 11 for Study I. Table 12 shows the multiple
correlations of the six factor scores from Study I and also the
cross-validated multiple correlations obtained from Study II. These results
show that the GAT factors did not retain their validity for the prediction
of the Pass/Fail criterion; and there was a substantial decrease in
validity for the prediction of the Pass/FTD criterion. As noted earlier,
these decreases may be attributed, in part, to the instability of the
factors based upon such a relatively small sample. While the
cross-validated multiple correlations for the simple GAT scores are not yet
available, it is expected that there will be considerably less shrinkage
in these correlations. Table 13 gives an approximation of these shrunken
multiple correlations, calculated using the standard correction for
shrinkage.
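
The paper does not state which shrinkage formula was used. As a sketch,
one common textbook form (often attributed to Wherry) adjusts the squared
multiple correlation by the ratio (N - 1) / (N - k), where N is the number
of subjects and k the number of predictors. That form is an assumption
here, but it does reproduce the corrected values printed in Table 13 from
the N, k, and R values in Table 11. The function name is hypothetical.

```python
import math


def shrunken_r(r, n_subjects, n_predictors):
    """Estimate the shrunken multiple correlation.

    Assumed form: R_adj^2 = 1 - (1 - R^2) * (N - 1) / (N - k).
    A negative adjusted R^2 is floored at zero.
    """
    r_squared = r * r
    adjusted = 1.0 - (1.0 - r_squared) * (n_subjects - 1) / (
        n_subjects - n_predictors
    )
    return math.sqrt(max(adjusted, 0.0))


# Inputs taken from Table 11 (five simple GAT scores, Study I).
print(round(shrunken_r(0.32, 140, 5), 2))  # Pass/Fail criterion
print(round(shrunken_r(0.37, 117, 5), 2))  # Pass/FTD criterion
```

The correction grows as the number of predictors approaches the number of
subjects, which is why five simple scores shrink far less than a solution
derived from 190 variables on 140 subjects would.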

In Table 14, the multiple correlations are reported for some of the
combinations of variables used in Study I. In general, it seems that
the best combination to achieve maximum validity should involve the
psychomotor and GAT variables. Although the GAT factor scores are used
in these regression problems, the same order of validity should be achieved
through using the simple GAT variables, with considerably less shrinkage.

While the GAT variables can make significant contributions to UPT
prediction, analyses performed by AFHRL staff members have indicated
that the use of a selection system which included the GAT along with
paper-and-pencil and psychomotor measures may prove economically unfeasible
under the present pilot flow conditions, as compared with a selection system
which consists of only the paper-and-pencil and psychomotor measures. This
is due not so much to the cost of the testing apparatus as to the
cost of transporting applicants to the testing sites and providing for
them while they are undergoing testing. Because of these considerations,
further research is contemplated which will examine the degree to which
validity comparable to that obtained with the GAT may be obtained from
smaller, more portable, low fidelity devices. Specifically, the research
will examine whether a "desk-top" light-plane simulator, without motion
base, or possibly a simple cathode ray tube display and small minicomputer,
might be used to replace the GAT system while measuring the same critical
learning abilities.

In conclusion, then, it has been found that the validity of
paper-and-pencil measures may be increased through the addition of items
which address the background and attitudes of the applicants. Furthermore,
the use of psychomotor measures can make a significant contribution
through the measurement of abilities not tapped by paper-and-pencil
measures.

The use of learning sample measures, such as those obtained from the
GAT system, can also make significant independent contributions to
increased validity; however, additional research must be performed to
develop a more cost effective vehicle for this type of testing.

Figure 3. Proportion of Variance by Rotated GAT Factors - Study I


Table 1. Pilot Composite Subscales

Form M                          Form N
------------------------------  --------------------------------------
Mechanical Information          Verbal Analogies
Mechanical Principles           Table Reading
Pilot Biographical Inventory    Electrical Maze
Aviation Information            Block Counting
Visualization of Maneuvers      Scale Reading
Instrument Comprehension        Mechanical Comprehension
Stick and Rudder Orientation    Instrument Comprehension
                                Pilot Biographical and Attitude Scale

Table 2. Experimental Paper-and-Pencil Scales
(N = 245)

Scale                       Pass/Fail     FTD/Other
--------------------------  ---------     -----------
Scale Reading                  .19           -.16
Letter Sets                    .10           -.15
Tool Functions                 .04           -.08
Electrical Information         .02           -.10
Mechanical Principles          .10           -.12
Word Knowledge                 .03           -.15
Word Grouping                 -.01        [illegible]
Verbal Analogies               .13           -.19
Block Counting                 .18           -.15
Point Distance                 .04        [illegible]
Electrical Maze                .13           -.14
Pattern Detail                 .07           -.14
Rotated Blocks                 .08           -.10
Tools                          .04            .02
Figure Analogies              -.01           -.04
Hidden Figures                 .05        [illegible]
Answer Sheet Marking           .05           -.09
Table Reading                  .17           -.08
Large Tapping                  .05           -.30
Trace Tapping                  .05           -.03
Discrimination-Reaction        .06        [illegible]

                             Pass/Fail      Pass/FTD
SVIB                         (N = 265)      (N = 227)
--------------------------   -----------    ---------
Key A                           .13*           .06
Key B                           .16*           .09
Key C                        [illegible]      -.01

                             Pass/Fail      Pass/FTD
OBAS                         (N = 257)      (N = 220)
--------------------------   -----------    ---------
Total Examination Scale         .15*           .03
Flying Deficiency Scale         .13*           .03

Table 4. Regression Problems Using Paper-and-Pencil
Measures for UPT Pass/Fail Criterion

Problem                                         No. of
Number   Predictors                    N      Predictors     R
-------  --------------------------   ---     ----------    ----
   1     AFOQT                        715          5        .12
   2     Reference Battery            745         21        .20
   3     OBAS                         257          2        .15
   4     SVIB                         265          3        .18*
   5     OBAS & AFOQT                 206          7        .23
   6     OBAS & Reference Battery     250         23        .25
   7     SVIB & AFOQT                 214          8        .24
   8     SVIB & Reference Battery     258         24        .28
   9     SVIB & OBAS                  256          5        .23*

                                                  Full   Restricted
Null Hypothesis                                   Model    Model     df1   df2   F Ratio
------------------------------------------------  -----  ----------  ---   ---   -------
OBAS makes no contribution to AFOQT                 5        1        2    198    4.02*
SVIB makes no contribution to AFOQT                 7        1        3    205    3.13*
OBAS makes no contribution to Reference Battery     6        2        2    226    2.71
SVIB makes no contribution to Reference Battery     8        2        3    233    1.89

*p < .05
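
The F ratios in Table 4 compare a full regression model with a restricted
one. A minimal sketch of the usual computation follows, assuming the
standard formula F = ((R_full^2 - R_restricted^2) / df1) /
((1 - R_full^2) / df2), where df1 is the number of predictors dropped and
df2 is N minus the number of full-model predictors minus one. The function
name is hypothetical. Using the rounded R values printed in the table,
this reproduces the first three tabled F ratios; the fourth (1.89) is not
reproduced from the values as printed, which may reflect rounding or a
problem in the scanned table.

```python
def f_ratio(r_full, r_restricted, df1, df2):
    """F test for the gain in R^2 when predictors are added.

    r_full:       multiple R of the full model
    r_restricted: multiple R of the restricted model
    df1:          number of predictors added to the restricted model
    df2:          N - (number of full-model predictors) - 1
    """
    r2_full = r_full ** 2
    r2_restricted = r_restricted ** 2
    return ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)


# R values and degrees of freedom as printed in Table 4.
print(round(f_ratio(0.23, 0.12, 2, 198), 2))  # OBAS added to AFOQT
print(round(f_ratio(0.24, 0.12, 3, 205), 2))  # SVIB added to AFOQT
print(round(f_ratio(0.25, 0.20, 2, 226), 2))  # OBAS added to Reference Battery
```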


Table 5. Psychomotor Measures

                     Pass/Fail     FTD/Other
Measure              (N = 150)     (N = 150)
-----------------    ---------     ---------
Test 1 - X4            -.19*          .27*
Test 1 - X5            -.20*          .29*
Test 1 - Y4            -.14           .27*
Test 1 - Y5            -.20*          .30*
Test 2 - X4            -.21*          .27*
Test 2 - X5            -.20*          .25*
Test 2 - Y4            -.24*          .26*
Test 2 - Y5            -.18*          .26*
Test 2 - Z4            -.15           .25*
Test 2 - Z5            -.19*          .28*

                     Pass/Fail     Pass/FTD
Measure              (N = 234)     (N = 201)
-----------------    ---------     ---------
Test 1 - X4 + X5       -.19*         -.15*
Test 1 - Y4 + Y5       -.17*         -.16*
Test 2 - X4 + X5       -.16*         -.16*
Test 2 - Y4 + Y5       -.10          -.12
Test 2 - Z4 + Z5       -.12          -.06

*p < .05

Table 7. GAT Factors - Study I

             Pass/Fail       Pass/FTD
Factors      (N = 140)       (N = 117)
-------      -----------     -----------
I            [illegible]        .03
II              .27*            .37*
III             .00          [illegible]
IV           [illegible]        .09
V               .20*            .13
VI              .04             .08


Table 8. GAT Factors - Study II

             Pass/Fail
Factors      (N = 116)       Pass/FTD
-------      ---------       --------
I               .18             .18
II              .15             .16
III             .20*            .25*
IV              .16             .14
V               .06             .16
VI              .20*            .21*

*p < .05

Table 9. Simple GAT Scores - Study I

                                  Pass/Fail     Pass/FTD
GAT Variable                      (N = 140)     (N = 117)
------------------------------    ---------     ---------
Average Pitch Angle Deviation       -.26*         -.18
Average Bank Angle Deviation        -.28*         -.33*
Average Side Slip Deviation         -.19*         -.11
Average Heading Deviation           -.27*         -.19*
Average Altitude Deviation          -.20*         -.22*

*p < .05

Table 10. Simple GAT Scores - Study II

                                  Pass/Fail     Pass/FTD
GAT Variable                      (N = 116)     (N = 99)
------------------------------    ---------     ---------
Average Pitch Angle Deviation       -.28*         -.37*
Average Bank Angle Deviation        -.27*         -.26*
Average Side Slip Deviation         -.15          -.19*
Average Heading Deviation           -.09          -.14
Average Altitude Deviation          -.18          -.22*

*p < .05


Table 11. Multiple Correlations of Simple GAT Scores

                        No. of
Criterion       N     Predictors      R
---------      ---    ----------     ----
Pass/Fail      140         5         .32
Pass/FTD       117         5         .37


Table 12. Cross-Validated Multiple Correlations

                              No. of         Study I         Study II
Criterion    Predictors     Predictors      N      R        N       R
---------    -----------    ----------     ---    ---      ---     ---
Pass/Fail    GAT Factors         6         120    .42       99   [illegible]
Pass/FTD     GAT Factors         6          98    .48       99     .27


Table 13. Multiple Correlations of Simple GAT Scores
Corrected for Shrinkage

                        No. of
Criterion       N     Predictors      R      Corrected R
---------      ---    ----------     ----    -----------
Pass/Fail      140         5         .32         .28
Pass/FTD       117         5         .37         .33


Table 14. Joint Contributions - Study I

                                           No. of
Criterion    Predictors                  Predictors     N       R
---------    -------------------------   ----------    ---     ----
Pass/Fail    Psychomotor & GAT Factors       11        120     .49
             Psychomotor & SVIB               8        218     .29
             Psychomotor & OBAS               7        210     .30
             Psychomotor & Ref Battery       26        224     .32

Pass/FTD     Psychomotor & GAT Factors       11         98     .51
             Psychomotor & SVIB               8        186     .25
             Psychomotor & OBAS               7        179     .24

PSYCHOMETRIC SUPPORT FOR ITEM TYPES USED IN A NEW WRITTEN
EXAMINATION FOR ENTRY-LEVEL FIREFIGHTERS

Lois  C.  Northrop 

This paper briefly discusses the psychometric background of, and the
item types chosen to measure, the six ability constructs represented in
a new entry-level examination for use by the D.C. Fire Department.
Documentation of this nature is but one phase in the process of developing a
new measuring instrument. Extensive task and duty analyses, the linkage
of knowledges, skills, abilities and other worker characteristics to these
task analyses, and the development of the actual test plan are other phases
of the entire process which will not be dealt with here.

The six constructs discussed here are the six which were determined to
be the most critical and measurable of the 19 cognitive abilities studied.

Since human cognitive abilities are correlated, three to five of these
abilities will account for almost all of the variance in performance that
can be measured. The addition of more ability measures does not
appreciably increase the predictive validity of the test, i.e., the ability
of the test to predict future training and job performance. Therefore the six
critical abilities selected for inclusion in the entry-level firefighter
examination are, for all practical purposes, representative of all the
requirements for job success.

The six critical abilities are defined as follows:

1.  The  ability  to  read  and  understand  written  materials  and 
instructions. 

2.  The  ability  to  understand  and  follow  spoken  instructions  or 
orders. 

3.  The  ability  to  use  simple  mathematical  formulas  or  equations. 

4.  The ability to discover general rules or principles from
specific situations or events (learning from experience).

5.  The ability to recognize or identify problems or potential
problems.

6.  The  ability  to  make  judgments  and  decisions  when  information 
is  incomplete  or  conflicting. 

A search of the psychological literature provided six matching ability
constructs which have been consistently identified in numerous factor-analytic
studies over the past 30 or 40 years. Many of these studies represent the
foundation research in the field of psychometrics. There are differences,
of course, among the studies due to the use of a variety of subjects varying
in age as well as in ability level. The stability of these ability constructs
has persisted despite the diversity of conditions in the factor-analytic
studies in which they have been identified.

The first ability - the ability to read and understand written
materials and instructions - is represented by the well-known Verbal or
Verbal Comprehension factor which, according to the 1976 manual of
factor-referenced cognitive tests by Ekstrom, French and Harman of the
Educational Testing Service, can be found in at least 125 published
studies. The factor is a very stable one. It repeatedly appears in factor
analytic studies when a variety of tests in the verbal medium are included
in the test battery, and it shows great resistance to breaking up into
subfactors.

It has been suggested that multiple choice vocabulary tests based
on synonyms are the best for picking up individual differences in verbal
comprehension. However, assessing verbal ability with isolated vocabulary
words alone tends to suggest that Verbal Comprehension is only a
subfactor of a broader factor which involves reading comprehension, verbal
analogies, matching proverbs, grammar and syntax. Reading Comprehension
was chosen to measure verbal ability on the entry-level D.C. firefighter
written examination because it represents a more diversified aspect of
the ability and has identified a Verbal factor along with a number of other
tests of a verbal nature (analogies, proverbs, grammar). In addition,
Reading Comprehension items seemed most appropriate for assessing what was
required - the ability to read and understand written materials and
instructions.

During World War II, the Army Air Force research psychologists used
tests of Reading Comprehension to assess verbal ability in the Air Force
Classification Batteries. Items were based on paragraphs which were
"simple descriptions of Air Force jobs, the training involved and the
individual characteristics required for success." Several questions were
asked about each paragraph based either on specific information contained
in it or on inferences which could be drawn from the material presented.

Reading comprehension items for the entry-level D.C. firefighter
written test consist of very short paragraphs or parts of paragraphs taken
directly from the official fire department manuals, training guides and
other written materials used in the work of the entry-level D.C.
firefighter. These items sample directly the reading material which
entry-level firefighters deal with both during training and subsequently on
the job. Such items test their ability to read, understand and interpret
this material. Only material of an introductory or very general nature is
selected. Some editing may have been done to eliminate any technical
wording with which an entry-level applicant would not be expected to be
familiar, or to clarify the statements. No changes are made in the concept
or meaning of the information or in the general level of difficulty of the
passages to be read and comprehended.

The second ability - that of understanding or following spoken
instructions or orders - was identified with an Integration factor in Army
Air Force research during World War II. The factor was characterized by
the ability to adapt quickly to new instructions and carry them out
successfully. Items in the test presented modifications or variations in
the instructions given at the beginning of the test.

The same factor was identified as Attention by Wittenborn, the
assumption being that attention is required when following a series of
rapidly given oral directions, none of which is difficult by itself, but when
several are given at once, considerable effort is necessary in order to
follow all simultaneously. Tests of attention were designed so that
performance on them was not dependent on intellectual ability. Tasks presented
were independent of knowledge and content, and involved material familiar
to anyone, i.e., digits or letters of the alphabet. Scores depended on the
ability to continually sustain mental effort or "concentrate."

The entry-level firefighter test for this ability provides a special
action sheet which in addition to letters and digits contains a number of
simple geometric shapes. Each item consists of a set of oral directions
such as: "Make a cross in the first circle and also a figure 1 in the third
circle."

When the instructions for an item are complete, the examinees are given
from 5 to 10 seconds (depending on the complexity of the instructions) to
respond before the instructions for the next item are presented.

Instructions for each item in a test of this nature can be given
considerable variety and a wide range of intricacy, and yet no single
instruction is difficult by itself. The content material (digits, letters of
the alphabet, simple geometric shapes) is familiar to all. In this test, an
examinee's ability to comprehend and adapt quickly to changing instructions
is well tested. This particular test has been in use in the U.S. Civil
Service Commission for many years and is considered to be one of the best
available measures of the ability to understand and follow spoken
instructions or orders.

The third ability - the ability to use simple mathematical formulas
or equations - is best represented by the clearly defined Number factor
found in over 80 studies. This factor is not a major component in
mathematical reasoning or higher mathematical skills. It is simply the
ability to perform basic arithmetic operations with speed and accuracy.

The best reference tests for the Number factor are those with the
greatest amount of number handling, i.e., tests of the four arithmetic
operations (addition, subtraction, multiplication and division). Such tests
are outstanding in purity (i.e., they do not load on other factors) and in
the size of their loadings on the Number factor.

The item type which will be used in the D.C. entry-level firefighter
written test to assess the ability to use simple mathematical formulas is
very specifically job related. A comprehensive study of the mathematical
abilities required to perform the entry-level firefighter job in D.C. was
undertaken, and the appropriate level of difficulty for these required
abilities was determined. Simple formulas taken directly from the training
manuals are presented with numerical values which may be substituted in the
formula in order to calculate some quantity. The formulas, of course,
involve the four arithmetic operations, and facility with the operations
determines how easily and quickly an applicant can arrive at the correct
answer. Such items, of course, are not as pure a measure of numerical
ability as simple arithmetic problems, since the additional task of
substituting a number for the appropriate letter in a formula is required.
However, the format of the proposed items is more suitable for assessing
the ability actually required in the entry-level firefighter job.

The ability to discover general rules or principles from specific
situations is best represented by the Induction factor which appeared
as one of Thurstone's 16 primary mental abilities and has continued to
appear in the literature. The factor fits the concept of inductive
reasoning well, namely reasoning from the specific to the general.

The typical induction test presents groups of words, letters, numbers
or figures, and the examinee is asked to discover a principle or rule used
in their makeup. Using the discovered principle or rule, the examinee can
select the appropriate addition or continuation for the group.

The identification of Induction as a factor separate from Deduction
and/or General Reasoning is not as straightforward as that for the Verbal
and Number factors. There exists some controversy as to whether
individual aspects of the reasoning process are separately measurable.
Nevertheless, a [number of item types have been used to measure] inductive
ability throughout the psychometric literature. One of
the most enduring of these is Letter Series. Such items consist of a series
of letters which follow a pattern. The examinee's task is to discover the
rule or principle underlying this pattern and thus find the next letter in
the series. Because the Letter Series item type is well documented in the
psychometric literature as a measure of Induction, it was chosen for
inclusion in the D.C. Fire Department entry-level written examination. This
item type has been used in other written examinations developed by the Civil
Service Commission and has been found very satisfactory for assessing
inductive ability.
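
The logic of a Letter Series item can be sketched in code. This is a
minimal illustration, assuming the simplest possible rule: every letter is
a constant number of alphabet positions beyond the one before it. The
sample series and the function name are hypothetical, and actual
examination items may follow more intricate patterns than this sketch
handles.

```python
import string

ALPHABET = string.ascii_uppercase


def next_letter(series):
    """Return the next letter if the series advances by a constant step.

    Returns None when no single constant step explains the series.
    The alphabet is treated as circular (Z is followed by A).
    """
    positions = [ALPHABET.index(ch) for ch in series.upper()]
    steps = {(b - a) % 26 for a, b in zip(positions, positions[1:])}
    if len(steps) != 1:
        return None
    step = steps.pop()
    return ALPHABET[(positions[-1] + step) % 26]


print(next_letter("ACEG"))  # rule: skip one letter each time
print(next_letter("ABDG"))  # no constant step, so no answer under this rule
```

The examinee's task mirrors the two stages in the function: first infer
the rule from the specific letters given, then apply it to produce the
continuation.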

The ability to recognize or identify problems or potential problems is
identified with a factor which has been defined as Sensitivity to Problems.
The factor's appearance has been confined to Guilford's laboratory, where
it has been isolated in a number of different studies.

The item type chosen for the problem identification items on the
entry-level firefighter written examination presents a plan or a situation;
the examinee's task is to determine the chief problem with, or the most
serious defect in, the given plan or situation. The item type is complex -
that is, it has factor loadings on factors other than Sensitivity to
Problems.

Although subject matter for faulty plans or futile activities is drawn
from common situations which can have been experienced or heard about by
anyone, the task of finding the correct answer involves analysis and
evaluation of a situation and then thought beyond the given situation in
order to generate reasons why the plan or action is faulty and will not lead
to the desired result. For these reasons the item type was considered the
most appropriate measure of the ability to recognize or identify problems or
potential problems.

The last of the six abilities tested in the entry-level firefighter
examination - the ability to make judgments and decisions when information
is incomplete or conflicting - is best represented by the Judgment factor
which was first isolated in several studies, most of them carried out by
the Army Air Force Aviation Psychology Research program during World
War II. Typically, judgment items present a problem requiring a best
solution. Not all of the facts bearing on the solution are given, and some
reasonable assumptions or guesses must be made by the examinee as to what
the most likely of several possible occurrences might be. This concept of
judgment is obviously complex.

Any suitable item type to assess judgment will have variance in
common with other verbal tests, but the crucial task in measuring judgment is
the supplying of additional data (general knowledge, experience) by the
competitor in order to solve the problem.

A review of possible item types suggested the Wechsler (WAIS)
Comprehension item as the best single measure of judgment; such items present
"common sense" questions, successful answers to which depend upon one's
fund of practical information and one's evaluation of past experience.
Questions are of a type which adults have had to answer for themselves or
have heard discussed.

Other item types to measure judgment have been of a similar nature
and were used in the Air Force research program. Items presented problems
of a "common everyday" type, some having to do with work planning, the
solution to which rested not upon logical reasoning grounds but on the
ability of the examinee to draw upon his "common sense," experience and
general information background.

Acts of judgment are necessarily a part of many intellectual
functions. Judgment tests will therefore be factorially complex;
nevertheless, judgment is an important and assessable aspect of human
intellect, necessary to practical success in many occupations including that
of firefighting.

Judgment items for the entry-level D.C. Firefighter written
examination draw upon knowledge or experience of a very general,
non-technical, and current nature, the level of which is based on that for
any high school graduate. A five-alternative multiple choice format is used.
A short stem is sought in order to reduce the reading comprehension variance
of the question. Distractors may be based on false, irrelevant or
incomplete assumptions; the correct alternative should have a documented
source and should not be predicated on mere opinion. Such items will
appropriately measure the required ability.

A METHODOLOGY FOR ASSESSING PREFERENCES IN SPATIAL ARRANGEMENTS

Gall Metzger and Colleen L. Zelazny

INTRODUCTION

The design of living spaces has a strong impact on the "quality of
life" perceived by those who live and work within those spaces. For the
Navy and other organizations which control the physical surroundings of
their personnel for prolonged periods of time, there is an increased
awareness of the importance of structuring the environment to meet the
needs and desires of its occupants. The habitability program of the Navy
has looked at a wide variety of means of improving the perceived quality
of living spaces. Such a program is recognized as being desirable for
the sustained physical and psychological well-being of those who are in
the Navy. With the advent of the all-volunteer force, improved life
spaces are recognized as being essential in order to both attract and
retain the personnel the Navy needs.

Many factors have been examined by habitability studies. Some of
these factors have involved light levels, color schemes, surface
treatments, noise levels, temperature variations, and ventilation rates. An
area which has not received much attention has been that of spatial
arrangements. The studies that have been conducted in this area have
primarily involved the redesign of work spaces. In those analyses, the
arrangement of physical space to meet time and energy requirements, not
personal desires, has been the primary objective. For the life spaces,
no effort has been made to examine optimal spatial arrangements.

A prime requirement for developing improved habitability programs is
the determination of spatial preferences. A second reason for being
concerned with the assessment of spatial preferences is the possibility of
differences in the ways in which men and women perceive and utilize space.

Many architects, both male and female, profess that such
sexually-distinguishable differences exist. For the Navy, with the prospect of
sexually-integrated crews, differences in the spatial preferences of men
and women would impact the effectiveness of habitability improvement programs.

A study of the attitudes of men and women crew members aboard the
USS SANCTUARY led to the observation that "women aboard ship express a
greater need for personal privacy than do the men" (Martin et al., 1973,
p. 62). Whether this expression was simply a verbal phenomenon, or whether
it reflected some stronger, behaviorally-based need, could not be ascertained
at the time of that earlier study.

The primary objective of the research reported in this paper was to
develop tools and techniques which could be used for determining preferences
in spatial arrangements. A secondary objective was to administer these
tests to a limited sample of men and women to determine whether conspicuous
differences in spatial preferences existed between the two sexes.

METHOD  AND  RESULTS 

After  examining  a  number  of  means  of  determining  spatial  preferences, 
three  sets  of  assessment  tools  were  developed.  These  were  a  layout  diagram, 
a  figure  selection  test,  and  a  questionnaire.  For  this  exploratory  effort, 
the  life  space  selected  as  the  basis  for  the  study  was  a  dormitory  room 
shared  with  one  other  person  of  the  same  sex. 

The  first  tool,  the  two-dimensional  reduced-scale  layout  diagram, 
permitted  respondents  to  move  cutouts  depicting  two  twin  beds  and  two 
wardrobe/dresser units in a 9 x 12 foot room. The second assessment
tool presented sets of four possible room arrangements (derived for the
same furniture and space as for the first test) which the subjects were
to rank in order of preference. The third tool was a questionnaire
obtaining  responses  to  verbal  statements  about  preferences  in  layout 
design  as  well  as  to  the  need  for  privacy. 

LAYOUT  DIAGRAM 

A  preliminary  experiment  was  conducted  with  the  layout  diagram  in 
which it was given to 53 male subjects ranging in age from the late teens
to  early  twenties.  With  only  four  items  of  furniture  and  with  a  relatively 
small  space,  it  would  seem  that  the  varieties  of  responses  would  be  limited. 
In  fact,  however,  a  considerable  number  of  different  responses  was  possible. 
Considering  only  the  positions  of  the  two  beds  against  the  walls,  there  were 
ten  basic  possible  configurations.  If  one  or  both  of  the  beds  were  moved 
out from the walls, more variations were possible (but these were counted
as variants of the basic layouts). The introduction of the two wardrobe
units greatly increased the number of designs. A total of 33 basic
variations for beds and wardrobes was found to exist.

Considerable individual differences in preferences for spatial
arrangements were found in this preliminary experiment. Of the ten
possible basic configurations based on bed positions alone, eight
were selected by the respondents. While 40 respondents chose basic
configurations with both beds touching the walls at corners of the
room, 13 respondents chose to have at least one of the beds free-standing.

Of the 33 possible configurations based on bed and wardrobe positions,
18 were selected by the respondents.

Despite  the  diversity  of  the  responses,  some  general  findings  could 
be  promulgated.  Fully  70  percent  of  the  respondents  arranged  their 
beds so that they were parallel with their roommate's. More than
77 percent positioned their beds so that they were in corners of the
room. Slightly over 62 percent could be accommodated by a selection
from  three  basic  designs. 

Following this pilot study, the layout diagram test was modified and
presented,  in  conjunction  with  the  figure  selection  and  questionnaire 
tests,  to  a  sample  of  ten  males  and  ten  females.  This  limited  sample 
group  also  differed  from  the  first  in  that  the  respondents  were  primarily 
in  their  20's  and  30's. 

Considerable diversity in responses was found. Of the ten configurations
based on bed position alone, six were selected. Of the 33 configurations
based on bed and wardrobe positions, 11 were selected.

Nevertheless, some uniformity of response was observed. One particular
bed position configuration was selected by 40 percent of the
respondents. In fact, one specific bed and wardrobe configuration
was selected by 30 percent. Beds were positioned against the walls,
in  corners  of  the  room,  by  90  percent.  The  preference  for  locating 
the  beds  parallel  to  one  another,  observed  in  the  pilot  study,  was  not 
upheld  in  this  one;  the  selection  of  parallel  versus  perpendicular 
orientations  was  made  by  exactly  50  percent  of  the  respondents. 

These  differences  in  responses  between  the  pilot  study  and  this 
later  experiment  may  have  been  due  to  the  increased  age,  and,  hence, 
maturity  and  accumulated  experience  of  the  respondents  in  living  in 
dormitory conditions or in visualizing space from two-dimensional
cutouts. 

The most surprising finding was the reduction in diversity of
responses  and  the  concentration  of  such  a  large  percentage  of  respondents 
on  a  single  configuration. 

SCORING OF LAYOUT DATA. The tabulation of results from the layout
data proved unexpectedly difficult. The preliminary studies had indicated
some of the types of data which could be collected. The room designs
generated certain impressions which were difficult to define. Efforts to
stipulate definitions almost inevitably ran afoul of exceptions. Nor were
reliable instruments available to make all the measurements. A planimeter
for measuring surface area gave discrepant results each time it was used.
Nevertheless, some 16 different items were developed which could be
consistently measured. These are listed in Table 1 and a diagram of the room
and its furnishings is depicted in Figure 1. The items on which data were
taken included factors relating to the orientation of the beds, the distances
from the beds to other objects, the visibility of other objects from the
bed, and the personal area defined by each bed and wardrobe combination.

The analysis of the data obtained from the men and women indicated
remarkable overall similarities. Table 1 presents the averages of the
responses obtained from each sample. The greatest difference between the
men and women appeared to lie in the division of the room into personal
areas. Whereas the men generally divided the area equally, the women
tended to divide the area unequally (70 percent of the women versus
30 percent of the men).
Nevertheless, the unequal distribution of area by both the men and the
women was not done for personal gain; in both cases, the roommate benefited
more frequently than the respondent.

Table 1. Layout Diagram Results

CATEGORY AND ITEM                           MALES     FEMALES

Bed Orientation:*
 1. Parallel                                40%       60%
 2. Perpendicular                           60%       40%
 3. Converging view                         100%      100%
 4. Diverging view                          0%        0%

Bed Distances:**
 5. Pillow separation                       10.8'     10.0'
 6. Bed separation                          4.0'      3.9'
 7. Respondent to door                      8.9'      9.9'
 8. Roommate to door                        8.7'      8.0'
 9. Respondent farther from door            50%       80%

Bed Visibility:
10. Respondent to roommate                  80%       80%
11. Respondent to door                      80%       70%
12. Roommate to door                        70%       80%

Personal Area:
13. Respondent's frontage (bed to ward.)    6.?'      5.4'
14. Roommate's frontage (bed to ward.)      5.6'      5.8'
15. Equal frontages for both roommates      60%       30%
16. Equal areas for both roommates          70%       30%

 * Percentages indicate respondents arranging their room layout to
   include the listed feature.
** Measurements indicate the average distance in feet.

Figure 1. Room Layout Diagram

FIGURE  SELECTION  TEST 

The  figure  selection  test  was  designed  to  ensure  that  the  respondents 
would consider the major alternatives possible with the room layout design.
It also attempted to separate out the component decision points involved
in  arriving  at  a  satisfactory  design. 

The  test  consisted  of  sets  of  four  possible  configurations.  Within 
each  set,  the  respondent  was  required  to  rank  order  his  preferences.  The 
sets  were  designed  so  that,  inasmuch  as  was  possible,  one  primary  factor 
was  being  varied  while  the  others  were  held  constant. 

The  first  two  sets  consisted  of  the  room  with  two  beds  in  different 
arrangements. The locations of the pillows on the beds were not indicated
and  the  wardrobes  were  not  included  in  these  sets. 

The first set offered the choice between beds which were parallel and
beds  which  were  perpendicular.  Both  the  males  and  females  were  equally 
divided  between  parallel  and  perpendicular  arrangements. 

The  second  set  offered  the  choice  between  beds  which  were  positioned 
in different corners of the room at increasing distances of separation.
Both the males and females preferred the greatest separation (with beds in
opposite  corners)  and  avoided  the  least  separation  (parallel  beds  occupying 
adjacent  corners  on  the  shorter  wall). 

The third and fourth sets consisted of drawings of the room with two
beds  on  which  the  locations  of  the  pillows  were  indicated  (the  wardrobes 
were  not  included  in  these  sets). 

The  third  set  offered  the  choice,  with  beds  arranged  in  parallel  at 
opposite  corners  of  the  room,  of  having  the  heads  of  the  occupants  facing 
in converging, parallel, or diverging directions. Near universal
agreement was obtained favoring the converging arrangement. This preference
was followed, in descending order, by the parallel arrangement with both
roommates looking into the hall; by the same arrangement with both looking
at the outside wall; and by the diverging arrangement.

The  fourth  set  offered  the  choice,  with  the  beds  in  perpendicular 
arrangements at adjacent and opposite corners, of having the pillow
positions at varying distances of separation. Again, near universal agreement
was obtained favoring the maximum pillow separation, with the heads of the
beds in opposite corners of the room. The rank ordering of this and of
the remaining preferences indicated that two factors were at work: One
factor involved keeping the heads of the beds against walls (i.e., in corners
of  the  room);  the  other  involved  attaining  maximum  separation  within  that 
constraint. 

The fifth and sixth sets of drawings complicated the selection process
still further by the addition of the wardrobe units.

The  fifth  set  offered  the  choice,  with  the  beds  in  perpendicular 
arrangement  at  opposite  corners,  of  locating  the  wardrobes  at  various 
positions  with  regard  to  the  heads  of  the  beds.  Both  males  and  females 
preferred to have the wardrobe positioned so that there was a maximum
amount of open area. They least preferred selections which closed off
an  area  of  the  room  from  visibility  from  other  areas. 

The  sixth  set  identified  one  of  the  beds  as  that  of  the  respondent 
and  offered  the  choice  of  having  it  be  visible  or  hidden  from  view  from 
the  hall.  The  responses  of  the  males  and  females  showed  little  consistency 
with  this  set.  In  general,  a  slight  preference  appeared  to  exist  for  the 
more open floor plans, but the results were so highly individualized that
no  general  conclusions  can  be  supported  with  any  degree  of  confidence. 

The  figure  selection  test  revealed  several  findings  that  were  also 
reflected  in  the  room  layout  diagram.  Respondents  had  little  preference 
between parallel and perpendicular bed arrangements. They prefer to
maximize the separation between beds. The heads of the beds must be positioned
against a wall and respondents will accept a lesser separation between beds
in order to have the heads in this position. The positions of the heads
should be such that roommates are facing in a converging direction; parallel
viewing arrangements are not preferred, but, if they must be selected, the
respondents  would  require  that  they  be  oriented  to  afford  a  view  of  the 
entry  to  the  room.  A  preference  for  maximizing  the  amount  of  open  area  in 
the  room  appeared  present  in  most  responses.  Designs  with  sections  of  the 
room  visually  closed-off  from  the  other  were  not  regarded  favorably.  When 
respondents  were  given  the  opportunity  to  shield  their  beds  from  view  from 
the  hall,  they  showed  no  preference  for  selecting  those  arrangements. 

QUESTIONNAIRE 

The questionnaire was designed to tap the thought processes which
might be underlying the room layout and figure selection test behavior.
Because it was a written test which required the verbalization of a response
(as opposed to positioning the cutouts or selecting a design), it was
anticipated that the respondents might exhibit more social constraints in
developing their answers.

The  questionnaire  asked  specific  questions  about  the  design  process 
and  some  general  ones  about  the  need  for  privacy. 

DESIGN PROCESS. One portion of the questionnaire consisted of a
series of items related to the various identifiable aspects of the physical/
psychological arrangement of the rooms. These questions were related to
the discernible features which were being measured on the layout diagram.

The  responses  to  the  questionnaire  were  compared  with  the  room  layouts 
designed  by  each  respondent.  This  was  done  in  order  to  determine  whether 
the verbalized expression was realized in a physical manifestation; in
other words, to find out whether the questionnaire and the layout would
indicate the same response. (This had been an issue because anecdotal
evidence had suggested that the verbal behavior may be quite different
from the actual physical behavior.)

In  general,  the  responses  of  the  subjects  tracked  their  actual  layout 
diagrams  fairly  closely.  In  some  cases,  tradeoffs  were  necessitated  by 
the room design and not all idealized responses could be accommodated. In
most  instances,  however,  the  verbal  expression  about  design  characteristics 
was  paralleled  in  the  actual  behavior. 

The one major exception occurred with the responses to the question of
whether the respondent would consider it desirable for both roommates to
have equal amounts of personal area. All respondents, male and female,
replied affirmatively, but, as was noted earlier, 30 percent of the men
and 70 percent of the women did not achieve that goal in their layout
diagrams.

PRIVACY.  One  portion  of  the  questionnaire  was  directed  at  determining 
the  sensitivity  of  the  respondents  to  the  need  for  privacy.  This  was  assessed 
in two ways. A direct assessment was obtained by asking the respondent to
indicate, for each of nine activities performed in the room, his feelings
of a need for privacy on a 5-point scale. An indirect assessment was
obtained by asking, for the same activities, whether the door would
be  open  or  closed.  Responses  to  these  two  questions  (presented  in  Table  2) 
were  used  to  determine  an  overall  level  of  sensitivity  to  the  need  for 
privacy  for  each  respondent. 

The  responses  of  the  subjects  to  the  two  types  of  questions  tracked 
one  another  closely,  supporting  the  belief  that  both  questions  were  directed 
at  the  same  basic  area.  There  was  a  confounding  of  results  by  some  additional 
factors,  however.  For  example,  with  regard  to  listening  to  a  radio,  a  number 
of subjects indicated that they would close the door even though they
expressed no particular need for privacy (they would close the door out of
consideration  for  their  neighbors). 

In general, the females indicated a greater feeling of a need for
privacy than the males. (The one exception to this generalization was
with regard to writing where a greater number of males than females would
close the door.) These findings were significant at the 0.025 level.

The  privacy  rating  was  used  to  divide  the  sample  of  men  and  of  women 
into additional subgroups of high and low scores. The data obtained with
the layout diagram, the figure selection test, and the design process
portion of the questionnaire were then examined to determine whether there
were  any  significant  differences  in  the  responses  of  those  experiencing 
different  levels  of  need  for  privacy.  Despite  the  numerous  measurements 
and  comparisons,  no  conspicuous  and  consistent  results  were  apparent. 

Table 2. Privacy Questionnaire Results

                     PRIVACY*                   DOOR POSITION**
ACTIVITY      Male   Female   Difference   Male   Female   Difference

Sleeping      4.3    4.7      0.4          90     100      10
Napping       3.6    4.5      0.9          80     100      20
Reading       3.3    3.6      0.3          80     80       0
Dressing      3.3    4.5      1.2          80     90       10
Grooming      2.4    3.9      1.5          60     90       30
Writing       2.8    3.3      0.5          80     70       -10
Talking       2.3    3.3      1.0          40     60       20
Listening     1.6    2.2      0.6          50     50       0
Relaxing      2.0    2.5      0.5          20     40       20

 * Rated on a scale from 5 to 1 of decreasing need for privacy.
** Scored as percentage of respondents who would close the door
   while engaged in the listed activity.
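
As an illustrative check on Table 2, the short Python sketch below recomputes the Difference columns (female minus male) from the tabulated values and lists the activities on which the two sexes differ in each direction. The variable and function names are hypothetical and the code is not part of the original study; it uses only the figures printed in the table.

```python
# Values transcribed from Table 2.
# Each entry: ((male rating, female rating), (male % closing door, female % closing door))
TABLE_2 = {
    "Sleeping":  ((4.3, 4.7), (90, 100)),
    "Napping":   ((3.6, 4.5), (80, 100)),
    "Reading":   ((3.3, 3.6), (80, 80)),
    "Dressing":  ((3.3, 4.5), (80, 90)),
    "Grooming":  ((2.4, 3.9), (60, 90)),
    "Writing":   ((2.8, 3.3), (80, 70)),
    "Talking":   ((2.3, 3.3), (40, 60)),
    "Listening": ((1.6, 2.2), (50, 50)),
    "Relaxing":  ((2.0, 2.5), (20, 40)),
}


def differences(table):
    """Return {activity: (rating difference, door difference)}, female minus male."""
    return {
        activity: (round(f_rate - m_rate, 1), f_door - m_door)
        for activity, ((m_rate, f_rate), (m_door, f_door)) in table.items()
    }


diffs = differences(TABLE_2)

# Activities on which the mean female privacy rating exceeds the male rating.
higher_female_rating = [a for a, (rating, _) in diffs.items() if rating > 0]

# Activities for which a larger percentage of males would close the door.
more_males_close_door = [a for a, (_, door) in diffs.items() if door < 0]

for activity, (rating, door) in diffs.items():
    print(f"{activity:<10} rating diff {rating:+.1f}   door diff {door:+d}")
```

With the tabulated values, the female privacy rating is higher for all nine activities, and writing is the only activity for which a larger share of males would close the door.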


CONCLUSIONS 

DESIGN  CHARACTERISTICS 

The results of this study indicate that, although there are a great
many individual differences in terms of preferences for the way in which
a room will be arranged, there are some basic design characteristics which
will be more appealing to the greater number of people. Such factors as
having the beds facing in convergent directions, positioning beds so that
the heads are against the wall, maximizing the separation between the heads
of the beds, etc., are prime considerations. Other factors such as orienting
the beds in perpendicular or parallel fashion are not especially important.

A major finding of this study is that, although there are significant
differences in the expression by males and by females of the need for
privacy, these differences are not reflected in sexually-differentiated
preferences for spatial arrangements. Both the males and females
demonstrated substantially similar responses to the layout diagram and to the
figure selection test.

If substantiated by further research with additional subjects, and if
upheld for other life spaces which would need to be examined, these findings
would allay the concerns of habitability engineers about designing life
spaces which may prove equally satisfactory to male and to female personnel.

ASSESSMENT TOOLS

This study has also indicated that a battery of assessment tools may
be used to determine spatial preferences for life spaces. The results
obtained with such varied tools as the free-response layout diagram; the
forced-choice figure selection test; and the verbally-oriented
questionnaire, all led to similar conclusions.

The layout diagram permits maximum variation in response, yet is limited
because the subject may be constrained by his ability to conceptualize
alternative patterns. The figure selection test permits particular design
factors to be isolated by holding certain factors constant while manipulating
others. The questionnaire permits an assessment to be made of internal factors, but
care  must  be  taken  to  distinguish  between  those  questions  pertaining  to 
tangible,  physical  aspects  of  design  and  those  pertaining  to  the  emotional, 
psychological  aspects  or  feelings. 

Each  test  has  its  own  particular  merits.  A  battery  of  tests  such  as 
was developed for this study could be used to determine the spatial
preferences of subjects for the improved habitability design of their life spaces.


OBVERSE FACTOR ANALYSIS TO ENHANCE THE VALIDITY

Fourteen years ago I was privileged to be the keynote speaker
at the 6th annual MTA conference held at the Coast Guard Institute
in Groton, Connecticut. Entitled "Testing is Serious Business",*
my address contained several examples of my discontent with certain
test methodology in vogue at the time. One of these problem areas
I mentioned was the relative usefulness of configural versus linear
or summative models in scoring our tests and in deriving our
predictive formulae. This paper then becomes a modest attempt to
substitute action for talk by providing a demonstration of what happens
to the validity of selection tests whose scores were derived from
weighted patterns of responses rather than weighted or unweighted
sums of responses, the latter being the commonly accepted scoring
technique used by most test practitioners.

The need for a new approach to submariner selection was based
upon several observations regarding certain aspects of the
submariner selection data collected over the past two decades. In the
first place, involving a variety of aptitude, interest and personality
tests, our linearly-derived validity coefficients, single or multiple,
rarely exceed 0.40 with our available training criteria. Secondly,
in an era of shortages of quality submariner volunteers when false
positive selection errors (reject a good candidate) are very serious,
there is the observation that there is an almost infinite number of trait
configurations descriptive of the 30-40% of the submariner candidates
who fail at various phases of their career. Aptitude deficiencies,
deficient or inappropriate motivation, attitudes or interests, and
emotional instability in an endless variety of patterns characterize
most submariner "attrites" (Weybrew, 1963). It seems much too simple,
certainly unrealistic, to expect a linear, summative equation made
up of one or more aptitude scores, for example, to yield substantive
predictive indices with respect to these attrition criteria.

Thus, stated more precisely, the objectives of this study were
twofold: (1) Empirically, to identify personality types
characterizing "attrites" and successes, then (2) to investigate the differences
in the predictive validity of standard selection tests for "good"
and "poor" fits to these "types". There was one assumption
underlying this study, viz., that variables interrelate differently
within different classes or types of individuals. It follows,
therefore, that only when the test battery contains a single common
factor and the criterion population is homogeneous in the sense
that the interaction matrix of the variables (items of a questionnaire,
in this particular study) is equivalent for all criterion sub-groups,
will a given prediction formula be maximally effective for the total
criterion population. It was therefore the burden of this study to
show that obverse factor analysis is one possible technique for
grouping the total criterion population into homogeneous sub-groups
within which differential predictive validity may be found.

*The opinions or assertions where they appear in this paper are those
of the author and are not to be construed as the official views of the
U.S. Navy Medical Department.

METHOD  AND  PROCEDURE 

As alluded to in the introductory comments, the method of choice
for deriving personality "types" was obverse or inverse factor
analysis (Cattell, 1952), a factorization technique by which
hypothetical factors are derived from between-person covariance or
correlation matrices. Found early in the history of submarine
psychology to be useful in identifying trait patterns characterizing
successful submariners (Weybrew, 1953), this factor analytical technique
(called Q-technique if Q-sorts are involved) yields factors whose
structure is defined by factor loadings on persons rather than loadings
on tests or measures produced by the more common between-test or
R-technique of factor analysis.

Subjects. The sample upon which the typology derived by obverse
factor analysis was based consisted of twenty enlisted candidates for
submarine training. This sample was drawn randomly from a population
of 800, 15% (3 Ss) from high-achieving successes (stanine 9), 15% from
the stanine "1" success group and the remainder (14 Ss) from the
attrition segment. The validation samples consisted of two independent
Submarine School classes, N = 277.

Data collection and analytical techniques. A psychiatric screening
questionnaire, "custom-tailored" for submariner candidates, the Personal
Inventory Barometer (PIB), was used to identify the adjustment types
(Weybrew and Youniss, 1957). Validated mainly to identify adjustment
failures at the training level, the 52 keyed items of the PIB yielded
high concurrent validity with such well-known personality tests as
Scale 7, Psychasthenia, MMPI (0.73, N = 250) and with the Guilford-Martin
(GAMIN), -0.66 and -0.73 with the Absence of Inferiority and the
Nervousness factors respectively. The PIB employs a 10-point multi-category
response format extending from "0", "Not at all like me" to "9",
"Exactly like me" with 3 intervening anchor-categories.


The  analytical  procedure  consisted  of  the  following  steps: 

(1) The 52-item array for each of the 20 Ss was first converted
to ipsative form, that is, deviation scores (stanines) were
calculated from the means and SD's for each of the 52-item response arrays
for each of the 20 Ss; (2) Thurstone's Group Centroid Method of
factor analysis was applied to the 20x20 inter-person matrix of
Pearson Product Moment coefficients and (3) the resulting 6x20 factor
matrix was rotated orthogonally by a simple geometric technique (Fruchter,
1954). The rotational criterion was simple structure. The
"person-vectors" were coded but not identified during the rotational
procedure so as to provide a "blind" control on the person (the author)
carrying out the calculations.
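
The first part of this procedure, up to the inter-person matrix, can be sketched as follows. This is a minimal illustration in Python, assuming randomly generated responses in place of the PIB data (which are not reproduced here) and using z-scores rather than stanines for the ipsative conversion; the function names are hypothetical, and the centroid extraction and rotation are not attempted.

```python
import math
import random


def ipsatize(responses):
    """Express one person's item responses as deviations from that person's
    own mean, in units of that person's own standard deviation."""
    n = len(responses)
    mean = sum(responses) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in responses) / n)
    return [(x - mean) / sd for x in responses]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def inter_person_matrix(profiles):
    """Correlate persons with one another across items (the obverse
    arrangement), rather than items with one another across persons."""
    z = [ipsatize(p) for p in profiles]
    return [[pearson(a, b) for b in z] for a in z]


# Hypothetical data: 20 persons x 52 items on the 0-9 response scale.
rng = random.Random(0)
profiles = [[rng.randint(0, 9) for _ in range(52)] for _ in range(20)]

matrix = inter_person_matrix(profiles)  # 20 x 20, ones on the diagonal
```

Because the Pearson coefficient is unaffected by a linear rescaling of either profile, the z-score conversion in this sketch leaves the inter-person correlations unchanged; a coarser conversion such as stanines could shift them slightly.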

RESULTS 


Delineation of the Obverse Factor Types. Thirty-six per cent of
the between-person correlation coefficients were significant at the
5% confidence level. Nevertheless, the correlations were low, with a
mean coefficient of 0.23 and a S.D. of 0.12. For the most part, the
residual matrix was exhausted after the extraction of six reference
centroids although Tucker's criterion was met after the fifth
residuals were computed. The mean and S.D. of the residuals were .07
and .05 respectively. These six reference axes were then rotated
orthogonally until simple structure was approximated. The factor
matrix is presented in Table 1.


TABLE 1

Orthogonally Rotated Factor Loadings of
Twenty Enlisted Men*

Subject                     Obverse Factors
Code
Number      I     II    III     IV      V     VI     h²

   1      .40    .35    .36    .38   -.09   -.02    .56
   2      .23    .58    .06   -.04    .09   -.13    .42
   3      .31    .59   -.24    .15    .23    .06    .58
   4     -.02    .06    .34    .11    .72    .00    .65
   5      .37    .20   -.07    .28    .08    .00    .27
   6      .42   -.24    .54   -.19    .43   -.15    .77
   7      .65   -.27   -.04    .34    .18    .00    .64
   8      .29   -.25    .10    .16    .11    .31    .29
   9      .47   -.26    .15    .08    .25    .03    .38
  10      .56    .23   -.05    .18   -.28   -.05    .48
  11      .66    .14    .17    .46   -.14   -.09    .72
  12      .12    .01    .16    .06    .84    .00    .75
  13      .08    .00    .12    .82    .02    .00    .69
  14      .13   -.06    .19    .13    .70    .17    .59
  15      .23    .12    .16   -.14    .30    .00    .20
  16      .62   -.10   -.02    .15    .28   -.21    .54
  17      .52    .30    .15   -.22    .23    .00    .48
  18      .55   -.14    .02   -.01    .09    .10    .34
  19      .17   -.12    .87   -.09    .36   -.02    .94
  20      .35    .35    .29    .21    .37   -.02    .51

*Factor loadings which are underlined indicate the persons used to identify
the factor.

An inspection of the underlined loadings in Table 1 indicates
that all of the persons except those with code numbers 1, 5, 8, 9,
15 and 20 were used to identify some one or other of the factors.
Judging from the low communalities (h² in Table 1), it is apparent
that persons with code numbers of 5, 8, and 15 had item profiles
unique to this group. The remaining three out of the six persons
were rejected as identifying persons largely on the basis that the
vectors representing their item profiles failed to rotate into any
of the factor hyperplanes.
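
The communalities can be checked directly from the loadings: for an orthogonal solution, h² for a person is the sum of that person's squared loadings across the six factors. The sketch below (Python; the names are hypothetical and the code is illustrative only) recomputes h² for the three low-communality persons and, for contrast, for code number 19, using the loadings printed in Table 1.

```python
# Rotated loadings on Factors I-VI, transcribed from Table 1.
LOADINGS = {
    5:  [.37, .20, -.07, .28, .08, .00],
    8:  [.29, -.25, .10, .16, .11, .31],
    15: [.23, .12, .16, -.14, .30, .00],
    19: [.17, -.12, .87, -.09, .36, -.02],
}


def communality(loadings):
    """Sum of squared loadings for one person across all factors."""
    return sum(a * a for a in loadings)


# Rounded to two decimals for comparison with the h² column of Table 1.
h2 = {code: round(communality(row), 2) for code, row in LOADINGS.items()}

for code, value in h2.items():
    print(f"code {code:>2}: h2 = {value:.2f}")
```

The recomputed values agree with the tabled communalities of .27, .29, .20 and .94, so less than a third of the profile variance of persons 5, 8 and 15 is shared with the six factors.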

The  persons  identifying  each  factor  in  the  rotated  solution  in 
Table  1  appeared  to  cluster  reasonably  well  with  respect  to  the 
criterion  groupings  used  to  select  the  20  subjects  for  the  obverse 
factor analysis. Thus, Factor I had the highest loadings by four
"academic" failures and two "temperamental" failures. Factor II,
on  the  other  hand  is  a  doublet,  loaded  by  two  men  who  graduated  in 
the  upper  stanine.  Factor  III  is  also  a  doublet  loaded  by  one  low 
achiever  (Stanine  1)  graduate  and  one  "academic"  failure.  Factor 
IV  contains  two  high-loading  "academic"  failures;  however,  one  of 
them  (code  number  11)  also  loads  Factor  I,  indicating  an  overlap 
between  the  two  factors.  Factor  V  appears  to  be  a  triplet  loaded 
by  two  "academic"  failures  and  one  "stanine  1"  graduate.  Finally, 
Factor  VI  appears  to  be  a  residual  factor  since  there  are  no  high- 
loading  persons  identifiable  in  the  present  solution.  Thus,  in  sum, 
all  but  one  of  the  five  obverse  factors  were  identified  by  submariners 
who  failed  or  were  low-achievers  in  basic  training.  Factor  II,  on 
the  other  hand,  is  identified  by  high-achieving  Submarine  School 
graduates. 

While not directly pertinent to the methodological emphasis of
this paper, it should be mentioned, for the more clinically oriented
psychologists, that once the obverse factors have been identified
by the people with high loadings on each factor, the structure or
content of these factors can be examined by a relatively simple
procedure: content analyze the most (and least) descriptive
items (in this study the PIB items) making up the protocols for the
persons with high loadings on each obverse factor. For example, the
PIB patterns suggest the following factor content: F1 - nervousness,
frustrated; F2 - good impulse control, socially sensitive;
F3 - cyclothymic; and F4 and F5, characterized by rather similar
neurotic patterns.

Relationship  of  fit  to  "obverse  types"  and  test  validity. 

The  rationale  for  this,  the  central  part  of  the  study,  is  that  the 
degree  to  which  the  persons  within  a  subgroup  fit  a  "failure"  pattern, 
to  that  degree  will  "success"  predictors  lose  validity.  Conversely, 
to the degree the persons within a sub-group fit a "success" type, to
that  degree  will  predictive  validity  be  enhanced. 
To test this proposition the following procedure was implemented:

The PIB was administered to one enlisted Submarine School class
of 322 men (none of the 20 men used in the obverse factor analysis
were included in this sample). With the item responses in stanine
form, three scoring methods were applied to the PIB items most and
least descriptive of the persons defining the obverse factor types.
These were: (1) the sum of those items least characteristic of the
factor subtracted from the sum of those items most characteristic;
(2) the simple sum of those item responses most characteristic; and
(3) the ratio of the sum of the most characteristic item responses
to the sum of the least characteristic responses. Thus, three scores
were obtained for each of the five factors identified in Table 1.
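The three scoring keys can be sketched as follows. This is a minimal illustration, not the scoring program used in the study: the function name, the item numbers, and the stanine responses are all hypothetical, and the ratio key assumes the "least characteristic" sum is nonzero (which holds for stanine responses of 1 to 9).

```python
def factor_scores(stanines, most_items, least_items):
    """Return the three key scores for one person on one factor.

    stanines    -- dict mapping item number to the person's stanine response
    most_items  -- items most characteristic of the factor
    least_items -- items least characteristic of the factor
    """
    most = sum(stanines[i] for i in most_items)
    least = sum(stanines[i] for i in least_items)
    return {
        "key_I": most - least,    # (1) most minus least
        "key_II": most,           # (2) simple sum of most characteristic
        "key_III": most / least,  # (3) ratio of most to least
    }


# Hypothetical responses of one man to six items of one factor cluster.
responses = {1: 7, 2: 5, 3: 8, 4: 2, 5: 3, 6: 4}
scores = factor_scores(responses, most_items=[1, 2, 3], least_items=[4, 5, 6])
print(scores)
```

Keys I and III both contrast the two item sets, while Key II ignores the least characteristic items altogether, which is one reason the three keys need not agree.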

Since all of the obverse factors except Factor II were associated
with submariner "attrites" and Factor II with submariner "successes",
the question was raised as to whether any of these scores based upon
the personality test configurations delineated by obverse factor
analysis could differentiate between the success and failure sub-groups
of an incoming class of submariners. Table 2 contains the results of
this analysis.

It is an interesting fact that all three keys for F1 in Table 2
were significantly discriminatory between those who succeed and those
who fail in Submarine School. Moreover, two of the three keys were
discriminating for F5. One reason for the validity of these two factor
scores quite probably was the greater number of identifying PIB items,
10 for F1 and 14 for F5, as compared with 6 items each for the remaining
factors. The greater number of items in the factor clusters for F1 and
F5 quite probably resulted in enhanced reliability of the derived scores.

The final part of this study (and perhaps the most interesting)
consisted of an examination of the effects of grouping "good" and
"poor" population fits to the criterion types upon the predictive
validity of the Navy Arithmetic-Mechanical Aptitude Test with respect
to the Submarine School pass-fail criterion. The methodological
hypothesis entertained here was that to the degree to which the persons
within a population sub-group fit a failure pattern, to that degree will
"success" predictors lose validity. Conversely, to the degree that persons
within a group fit a "success" type, to that degree will predictive
validity be enhanced. Keeping in mind that within the total group the
tetrachoric correlation between the aptitude test and the criterion is
.40, the data in Table 3 would seem to be suggestive.

Looking at Table 3, it is well to recall that Factors 1, 3, 4 and
5 were all defined by failure types, and F2 by success types. Looking
at F1 for all scoring keys, it is noted that within the population
sub-group fitting the failure type depicted by the items identifying F1,
the correlation with the criterion, for all keys, is significantly lower
than the same predictor-criterion relationship within those persons who
fit this failure type less well. On the other hand, the data for F2, a success
TABLE 2

Comparison of Factor Scores for
281 Graduates and 41 Failures in
Submarine School

           Scoring Key I           Scoring Key II     Scoring Key III
Factors    t-ratio       p(a)      t-ratio   p        t-ratio       p
F1         [illegible]   .005      2.80      .005     [illegible]   <.001
F2         [illegible]   n.s.      0.03      n.s.     [illegible]   [illegible]
F3         [illegible]   n.s.      1.1       n.s.     0.22          n.s.
F4         1             n.s.      0.02      n.s.     0.12          n.s.
F5         1.77          .04       2.40      .01      0.83          n.s.

(a) Probability based upon a "one-tailed" hypothesis consistent with the
content of each factor.


TABLE 3

Tetrachoric Correlation Coefficients of Arithmetic
plus Mechanical Scores with Submarine School Stanines
for "Good" and "Poor" Fits to the Criterion Types
Identified by Obverse Factor Analysis (Total N = 277)

                    Scoring Key I   Scoring Key II   Scoring Key III
Factor   Fit(a)     r(tetra)        r(tetra)         r(tetra)
F1       High       .38(b)          .39(b)           .39(b)
         Low        .54             .54              [illegible]
F2       High       .52             .54              .61(b)
         Low        .46             .48              .46
F3       High       .46(b)          .34(b)           .44
         Low        .61             .57              .50
F4       High       .?7(b)          .38(b)           .58(b)
         Low        .41             .59              .43
F5       High       .41(b)          .40(b)           .54
         Low        .59             .57              .48

(a) High-Low signifies above and below the approximate median of the
distributions of scores derived for each group of items associated with
each factor.

(b) Differences between coefficients significant at less than the 5% level
(one-sided test).
type, indicate that those persons who fit the success type (scoring
key III only) show a higher predictive relationship with the criterion
than those who do not fit the success type, or, more exactly, fit the
success pattern less well. F3, also a failure type, shows for two of
the three scoring keys the same sort of discrepancies in predictive
validity that were observed in F1; that is, if we hold the sub-groups
constant, we obtain an increase in validity for those Ss who did not
fit the failure classification as defined by the analysis. For F4,
only scoring key II yields a significant difference between the
coefficients in the predicted direction. For reasons unknown, scoring
keys I and III for F4 yield differences in the reverse direction.
On the other hand, both of the statistically significant keys for
F5 are in the predicted direction.

SUMMARY  AND  CONCLUSIONS 

There  were  two  general  questions  to  be  answered  by  this  study: 

First, is it possible by means of obverse factor analysis to classify
persons  meaningfully  in  terms  of  differences  in  personality  test  item 
configurations?  Secondly,  having  isolated  these  person-factors,  what 
happens  to  predictive  validity  within  population  sub-groups  showing 
good  and  poor  "fits"  to  these  classes? 

There was one fundamental assumption underlying this study, viz.,
that variables interrelate differently within different classes or types
of individuals. It follows, therefore, that only when the test battery
contains a single common factor and the criterion population is
homogeneous, in the sense that the interaction matrix of the variables
(items of a questionnaire, in this particular study) is equivalent for
all criterion sub-groups, will a given prediction formula be maximally
effective for the total criterion population. It was therefore the
burden of this study to show that obverse factor analysis is one
possible way to group the total criterion population into homogeneous
sub-groups within which differential predictive validity may be found.

What  did  the  results  show? 

The  major  finding  is  contained  in  Table  3.  The  data  in  this  table 
demonstrated  differential  predictive  validity  of  the  Arithmetic-Mechanical 
Aptitude  scores  within  "good"  and  "poor"  fits  to  the  adjustment  classes 
identified  by  the  obverse  factor  analysis.  It  was  shown  that  within  a 
group  of  submariner  volunteers  who  fit  the  "failure"  class,  the  aptitude 
predictive  validity  is  significantly  lower  than  for  those  who  fit  the 
failure class less well. At the same time the validity of the same
predictor within the group fitting the success "type" was higher than was
found  for  the  group  failing  to  fit  the  success  type. 

These findings would seem to be in accord with expectations. Factorial
composition (i.e., what dimensions the test items are tapping) would seem
to depend upon the characteristics of both tests and persons tested, and
it follows that differential predictability will be found for the
sub-groups compounded within the population sample. Actually a different
prediction formula is needed for each of the population sub-classes.

In addition to identifying trait clusters descriptive of persons
who failed in Submarine School, the results of this study suggested
a somewhat different approach to submariner selection (Weybrew and
Kinsey,  1968).  After  applying  the  selection  battery  to  the  volunteer 
population,  the  total  group  should  be  subdivided  into  classes  according 
to  the  goodness  of  fit  to  the  success  or  failure  trait  patterns  as 
defined  empirically  as  outlined  in  this  study.  Then,  according  to 
the  findings  of  this  study,  maximal  prediction  efficiency  will  be 
obtained  by  deriving  a  separate  prediction  formula  for  each  population 
sub-grouping. 

In short, it appears that workers in personnel selection (including
submariner selection) most probably are consistently underestimating the
validity of their predictor measures as a result of a number of
heterogeneous groups being included in the population sample. Once the
sample has been classified according to factorial composition (and an
obverse factor analytic technique is one possible way to do this),
predictability within population sub-groupings should be greatly
enhanced. By following somewhat the paradigm outlined in this study, and
more profitably, perhaps, by isolating population sub-classes by means of
a more varied configuration of measures (not only psychiatric screening
items as used in this study, but also aptitude, interest, and values
scores as well as selected items of biographical information), it should
be possible to raise single and multiple validity coefficients to the
0.80 - 0.90 range for certain predictive problems.
Improving Estimates of the Standard Error of the Mean
by: Donald O. Shepherd, Robert L. Malone and Jack A. Kavanagh

Loyola  University  of  Chicago 


Statistical estimates of the standard error of the mean are apt to be
somewhat erroneous when they fail to consider the measurement error, or conversely
the  reliability,  associated  with  the  items,  scales,  or  operations  by  which  such 
measurement  is  attempted. 

In  the  case  of  test  items,  as  this  paper  illustrates,  estimations  of  the 
standard  error  of  the  mean  are  more  accurate  when  based  on  a  formula  containing 
a  correction  for  error  of  measurement. 


While estimates of the standard error of the mean have traditionally
recognized variance as a function of the sample of people, Peters and Van Voorhis (1),
as well as Shepherd and Winiewicz (2), have cited the importance of acknowledging
variance associated with the sampling of the test items. As the present paper
illustrates, each type of variance, i.e. people or tests, may be isolated under
special conditions. For example, Peters and Van Voorhis (p. 134, Formula 66)
give as the standard error of the mean in the case of the correlated test samples
matched on an infallible criterion the following formula:


(1)   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{1 - r}$


where  r  is  the  reliability  coefficient  of  the  test. 

Peters and Van Voorhis (pp. 134-135) go on to state: "We would have such
matching on a true criterion where the same group was to be retested, for here the
paired individuals are the same persons: consequently truly paired as to ability.
The variability of the means to be expected if we should repeatedly retest the
same group is probably what we usually have in mind when we think of the standard
error of a mean; hence formula (1) is the one most often to be used."

Clearly, the r that Peters and Van Voorhis use in the above formula is
the reliability coefficient, and matching on an infallible or true criterion
amounts to testing the same group twice; i.e., r is the test-retest reliability
coefficient. The "variability of the means" is a consequence of measurement
error, as the variability cannot be due to the sampling of people.

(1) Peters, C.C., and Van Voorhis, W.R., Statistical Procedures and Their
Mathematical Bases (New York: McGraw-Hill, 1940).

(2) Donald O. Shepherd and Casimer 3. Winiewicz, "Compleat Formula for the
Standard Error of the Mean", Proceedings, 79th Annual Convention, American
Psychological Association, 1971, pp. 97-98.
Peters and Van Voorhis used the term "standard error of the mean" while using
either the formula for the sampling of people (classical formula, p. 130):

(2)   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

or for the sampling of test items or forms (measurement error formula, p. 134):

(3)   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{1 - r}$

Formula (2) is to be used for determining the variance of the sample means
for a single test form where the thing that varies is the sample of people.
Formula (3) is to be used where the sample of people remains constant (i.e. a
population), but the sample of test forms (items) varies.

Actually, however, most situations entail a sampling not only of people from
some population, but a sampling of items from some domain of test items. Hence,
realistically, it is most important to deal with both types of variance
simultaneously as provided in the hypothetical example explored in the
remainder of this paper.

For example, let us consider a miniature representation as follows in which
tests as well as people differ from sample to sample:

The total population (N) is 4 and sample size (n) is 3. The people are
represented by the letters A through D. The domain of the test items
(K) is 5, sample size (n) is 3, and length of the test form (k) is 3.
The sizes of the universes (people and test) are here kept finite for
practical and illustrative purposes.

Table I (following) contains the population of people and test items. It
also has each individual's response to each item in the domain of test items.

Table II contains every possible sample of people of size 3 and every possible
sample of test items of size 3 (test form).
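The sample spaces described for Table II can be enumerated directly. The sketch below assumes only what the text states (four people A through D, five items, samples and forms of size 3); the variable names are illustrative.

```python
from itertools import combinations

people = ["A", "B", "C", "D"]  # population of people, N = 4
items = [1, 2, 3, 4, 5]        # domain of test items, K = 5

# Every possible sample of 3 people and every possible 3-item test form.
person_samples = ["".join(s) for s in combinations(people, 3)]
test_forms = list(combinations(items, 3))

print(person_samples)   # the four samples of persons
for number, form in enumerate(test_forms, start=1):
    print(number, form)  # the ten test forms, in the order Table II lists them
```

With 4 person samples and 10 test forms, Table IV has 40 sample means, one for each combination of a person sample with a test form.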

Table III contains the score for each person by test form. At the far right
under T is each person's "true" score calculated from the sample of tests. The
population mean for each unique test form is the bottom row of scores (μ_k), i.e.
population means for each test form calculated across all persons. Finally,
across all persons and all test forms the overall population "true mean" (μ_T) is
1.50. Relevant error variances and other data have been computed and are
recorded at the bottom of Table III.

Table IV contains the sample means of every possible combination of people
(in groups of 3) and test form. For example, the sample mean of 1.67 at the
intersection of Form I and sample ABC was determined by taking the scores that
individuals A, B and C earned on those items (1, 2, and 3) comprising test form I.
At the far right of Table IV are the true means (T) of each of the samples. Each
true mean is the average of the sample means across all possible test forms. The
means of each form are recorded in the bottom row as μ_k. The population true mean
(μ_T) is 1.50. Relevant error variances have been computed from the data as well
as estimated from the "Compleat Formula" at the bottom of the table (refer to
Note ** on Table IV).


The reliability of the means (Table IV) is less than the reliability of the
population (0.33 vs. 0.60). This is caused by the finiteness of the population.
The true variance of the means is a function of (N - n)/(N - 1), but the error
variance is not. As stated in the Compleat Formula for the Standard Error of
the Mean:

$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}\, r_{xx} \left(\frac{N-n}{N-1}\right) + \frac{\sigma^2}{n}\,(1 - r_{xx})$
As noted on Table IV where n = 3, the standard error of the mean as determined
by the data is 0.15. Substituting the appropriate data from Table III into the
preceding equation yields:

$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{3}\,(0.60) \left(\frac{4-3}{4-1}\right) + \frac{\sigma^2}{3}\,(1 - 0.60) = 0.05 + 0.10 = 0.15$



Peters and Van Voorhis (p. 162, formula 89) indicate the correlation of
sample means is the same as the correlation for the population. This is now seen
to be an incomplete relationship. The compleat formula for the correlation or
reliability of sample means can be easily derived. The Compleat Formula for the
Standard Error of the Mean defines error variance of the sample means as:

$\frac{\sigma^2}{n}\,(1 - r_{xx})$

and variance of the sample mean as:

$\frac{\sigma^2}{n} \left(1 - r_{xx}\,\frac{n-1}{N-1}\right)$

Substituting these into a traditional formula for reliability (one minus the
ratio of error variance to total variance), one arrives at:

$r_{\bar{x}\bar{x}} = 1 - \frac{\frac{\sigma^2}{n}\,(1 - r_{xx})}{\frac{\sigma^2}{n} \left(1 - r_{xx}\,\frac{n-1}{N-1}\right)}$

The $\frac{\sigma^2}{n}$ cancels, leaving:

$r_{\bar{x}\bar{x}} = 1 - \frac{1 - r_{xx}}{1 - r_{xx}\,\frac{n-1}{N-1}}$
TABLE I

Hypothetical Response Matrix of Four Persons
to Five Test Items

[Response matrix of persons by items not legible in the source.]


TABLE II

Hypothetical Composition of Four Person Samples
and
Ten Test Samples

(a) Composition of Four        (b) Composition of Ten
    Hypothetical                   Hypothetical
    Samples of Persons             Test Forms of Three
                                   Items Each

    Samples                        Form     Test Item Numbers

    A B C                          I        1, 2, 3
    A B D                          II       1, 2, 4
    A C D                          III      1, 2, 5
    B C D                          IV       1, 3, 4
                                   V        1, 3, 5
                                   VI       1, 4, 5
                                   VII      2, 3, 4
                                   VIII     2, 3, 5
                                   IX       2, 4, 5
                                   X        3, 4, 5



Substituting appropriate data from Table III into the above formula yields:

$r_{\bar{x}\bar{x}} = 1 - \frac{1 - 0.60}{1 - (0.60)\left(\frac{2}{3}\right)} = 0.33$

The result is exactly as calculated from Table IV. It is easy to see that
where N = infinity, $r_{\bar{x}\bar{x}} = r_{xx}$. It should also be noted that
where n = N, then $r_{\bar{x}\bar{x}} = 0.00$ unless $r_{xx} = 1.00$. In that
case, $r_{\bar{x}\bar{x}}$ would be indeterminate (1 - 0/0).


The  comparisons  presented  in  Table  IV  clearly  illustrate  the  usefulness  of 
estimating the standard error of the mean by the "compleat formula" which
combines variances associated with both people and tests simultaneously. These
results  imply  that  the  "compleat  formula"  should  be  used  in  all  cases  when  the 
population  is  finite  and  the  reliability  is  not  perfect. 
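The quantities quoted in the worked example can be checked with a short sketch. The printed formulas are badly degraded in the source, so the algebraic form used here is an assumption: it is the form that reproduces the quoted values (0.05 + 0.10 = 0.15 and a reliability of means of 0.33 with N = 4, n = 3 and a population reliability of 0.60). The function names are illustrative, and the score variance of 0.75 is likewise an assumed value, chosen only because it yields the two quoted terms.

```python
def variance_of_mean(sigma2, r_xx, n, N):
    """Split the variance of a sample mean into true and error parts.

    Assumed form: the true part carries the finite-population factor
    (N - n) / (N - 1); the error part does not.
    """
    true_part = (sigma2 / n) * r_xx * (N - n) / (N - 1)
    error_part = (sigma2 / n) * (1 - r_xx)
    return true_part, error_part


def reliability_of_means(r_xx, n, N):
    """One minus the ratio of error variance to total variance of the mean."""
    return 1 - (1 - r_xx) / (1 - r_xx * (n - 1) / (N - 1))


# sigma2 = 0.75 is an assumption, not a value legible in Table III.
true_part, error_part = variance_of_mean(sigma2=0.75, r_xx=0.60, n=3, N=4)
print(round(true_part, 2), round(error_part, 2), round(true_part + error_part, 2))
print(round(reliability_of_means(0.60, n=3, N=4), 2))
```

As N grows without bound the second function approaches the population reliability, and when n equals N it returns zero for any reliability below 1.00, which matches the two limiting cases noted in the text.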
A METHODOLOGY FOR ESTIMATING THE COST-EFFECTIVENESS

OF

ALTERNATIVE PRETESTS


BY

JACK H. HILLER, Ph.D.

ARMY RESEARCH INSTITUTE FIELD UNIT, PRESIDIO OF MONTEREY, CA


The purpose of the research reported here was to develop a
methodology for measuring the cost-effectiveness of alternative pretesting
procedures so that an optimal procedure may be selected. The research
was accomplished as follows: Variables that affect the amount of time
saved or lost by employing pretests were identified and defined. An
algebraic model which takes into account measurement accuracy and the
effect of pretesting time was constructed so that the amount of time
saved (or lost) by pretesting could be estimated. Alternative pretest
procedures were formulated. A limited sample of empirical data was
gathered to test the cost-effectiveness of the alternative pretest
procedures, using a highly efficient data collection procedure.
Estimates of the sampling distributions for the variables in the
cost-benefit model obtained from the empirical data were used to perform a
Monte Carlo study of the cost-benefit values for the alternative pretest
procedures.




A system for providing on-the-job individual skill training to
infantry soldiers is currently being developed by the Army Research
Institute under sponsorship of the Army Training Board. Certain key
features of this system, which also have relevance to this report, are
listed below:

--  Performance-oriented  training  and  testing  based  on 
Soldier's  Manual  tasks 

--  Decentralization of individual skill training in which
the immediate supervisor is the primary trainer

--  Pretesting  to  avoid  unnecessary  training 

The primary trainer will often be an infantry squad leader who may
have as many as nine soldiers in his squad. The squad leader in his
role of trainer is supposed to conduct pretesting to enable more efficient
use of training time. However, since the new system requires performance
testing as well as performance-oriented training, the degree to which
pretesting increases the system's efficient use of time is an unanswered
question.

Consider  the  following  hypothetical  situation.  A  squad  leader  with 
a  nine-man  group  intends  to  have  all  of  his  men  proficient  in  performing 
some  task.  It  happens  that  he  has  no  information  concerning  any  of  his 
men's ability to perform the task, so he conducts a pretest. The
pretest chosen is a performance test that may be administered to only one
man at a time and requires about ten minutes to conduct for each man.
Altogether then, conducting this pretest for the entire squad consumes
one and one-half hours of the trainer's time, and during this period the
squad members may not be spending their time usefully. If it turned
out  that  every  squad  member  failed  the  pretest  and  needed  substantial 
training,  then  no  time  was  saved,  and  one  and  one-half  hours  were  lost. 

Given that pretesting may not necessarily yield time savings, the
purpose of the research reported here was to develop a methodology for
measuring and predicting the cost-effectiveness of alternative
pretesting procedures so that an optimal procedure (including exclusion of any
pretesting)  may  be  selected  for  a  given  situation. 


This  research  effort  will  be  described  according  to  the  following 
stages: 


a.  Alternative  pretesting  procedures  were  constructed. 

b.  Variables which may affect the time saved or lost by employing
the pretesting procedures were identified.

c.  A  cost-effectiveness  model  was  formulated. 

d.  An  efficient  data  collection  procedure  was  designed. 

e.  A  computer  program  to  perform  Monte  Carlo  studies  with  the 
cost-effectiveness  model  was  written. 

Each  of  these  stages  is  described  below. 

CONSTRUCTION  OF  ALTERNATIVE  PRETESTING  PROCEDURES 

The  following  procedures  were  considered  for  possible  use  as  pretests: 

a.  Squad  member's  self-estimation  of  task  proficiency; 

b.  Squad  leader's  estimations  of  task  proficiency  of  his  squad's 
members; 

c.  Paper-pencil  criterion-referenced  testing; 

d.  Simulation; 

e.  Performance  testing. 

Self estimates are fast and easy to acquire, but of uncertain validity.
Given the high levels of turbulence found in operational units, the
infrequent occurrence of many infantry tasks, and the fallibility of
memory, squad leader estimation was rejected for this research effort.
Paper-pencil tests are relatively quick and easy to administer, but have
uncertain validity for infantry troops due to their verbal requirements.
Simulated testing was considered but rejected due to resource
requirements beyond the scope of this project. Hands-on performance testing is
typically  time  consuming,  especially  for  process  testing  which  requires 
observation  of  each  task  step.  But  performance  testing  represents  the 
criterion  measure  for  all  practical  purposes. 

While  there  are  clear  limitations  on  each  of  the  above  pretests,  a 
decision was made to explore the use of self estimation, paper and
pencil criterion-referenced tests, and performance tests.
One possible approach to using the three candidate pretests would
be to employ each one by itself. For example, a soldier could be asked
if he was able to perform a task to standard, and then either be placed
in training if he said "no", or assigned to some other activity if he
said "yes". Another possible approach is to arrange the candidate
pretests in a systematic order to capitalize on their virtues while
minimizing the effects of their weaknesses. Figure 1 shows an ordering
of the pretests which may provide an optimal procedure, in terms of its
cost-effectiveness. The procedure was designed to
provide the easiest and fastest pretest as the first step, the second
easiest to administer as the second step, and the most time consuming
pretest, the hands-on performance test, as the last step for anyone who
is not already eliminated.

Requiring soldiers to take the performance test as the last step
ensures that no one will falsely be considered proficient.
Theoretically, the only error that can be made with the above ordering of
pretests is to assign a soldier to training when he does not need it.
Such errors may occur either because the soldier misjudges his true
ability, or because he fails the paper-pencil test.

In addition to the pretesting procedure shown in Figure 1, several
other possible procedures were defined for this research. These were:

a.  Self estimate followed by performance test;

b.  Paper-pencil test followed by performance test;

c.  Performance  test  alone; 

d.  No  pretesting,  everyone  enters  training. 

These  procedures  are  diagramed  in  Figures  2-4. 

COST-EFFECTIVENESS VARIABLES


Having  defined  alternative  pretesting  procedures,  it  is  necessary 
to  devise  a  method  for  identifying  an  optimal  pretesting  procedure  for 
various training situations. The approach taken was to identify variables
which  may  influence  time  saved  or  lost,  and  then  to  construct  an  algebraic 
model  which  may  be  used  to  calculate  the  time  saved  or  lost.  Key 
variables  that  were  identified  are: 

a.  The  proportion  of  group  members  who  are  proficient  before 
training  is  conducted. 

b.  The proportion of these proficient group members who are
correctly identified by a pretest, and the proportion who are
incorrectly classified as needing training.

c.  The time it takes to conduct the pretest procedure.

d.  The  time  it  takes  to  conduct  the  mandatory  end-of-training 
performance  test. 


e.  The  time  it  takes  to  conduct  the  first  phase  of  training— the 
explanation/demonstration— after  which  a  trainee  may  request  to  be  given 
the  post-training  performance  test. 

f.  The  time  benefit  (or  loss)  accruing  from  pretesting,  which  is 
defined  as  the  total  number  of  man-minutes  saved  by  pretesting. 

The  benefit  equation  is  too  involved  to  explain  in  the  limited  time 
available  here,  but  it  is  presented  and  explained  in  the  handouts  for 
two  kinds  of  pretesting  errors,  that  is,  assigning  a  soldier  to  training 
even  though  he  is  proficient,  or  failing  to  assign  a  soldier  to  training 
despite the fact that he is not proficient.


AN EFFICIENT DATA COLLECTION PROCEDURE


The alternative pretesting procedures described above need to be
comparatively evaluated from the standpoint of their cost-effectiveness.
To do this, a domain of tasks for which training would be conducted must
first be specified, as for example the 57 common or basic tasks included
in the Soldier's Manual for MOS 11B. Given that a data
collection effort involving the four different pretesting procedures
would be time consuming and difficult to accomplish, it seemed important
to devise an efficient data collection procedure. The solution devised
was as follows. Participating soldiers are asked to read a description
of the task, conditions, and standards for each task that is sampled
from the Soldier's Manual. They are then asked first to estimate their
ability to perform it, second to take a paper-and-pencil test regardless of
their estimate, and finally to take the performance test regardless of
their written test result. The data collected by this procedure may
then be distributed to all four active pretesting alternatives by means
of a logic tree analysis, thereby effecting a sizeable economy in data
collection requirements. The analytic technique is illustrated by the
results for a sample task in Figures 5-8.
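
The tallying step of the logic tree analysis can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' program: the record layout, the function name `classification_rates`, and the sample data are hypothetical. Each soldier contributes a self-estimate, a written test result, and a performance test result for a task, so any screening rule can be scored afterward against the performance test.

```python
from typing import Callable, Dict, List

# One record per soldier for a single task. True means "GO".
# Field names are hypothetical; the paper does not specify a data layout.
Record = Dict[str, bool]


def classification_rates(records: List[Record],
                         screen: Callable[[Record], bool]) -> Dict[str, float]:
    """Score a screening rule against the performance test result.

    Returns G (proportion truly GO), E_N (truly GO but screened NO GO),
    and E_G (truly NO GO but screened GO), all as proportions of the group.
    """
    n = len(records)
    truly_go = sum(r["performance"] for r in records)
    go_called_no_go = sum(r["performance"] and not screen(r) for r in records)
    no_go_called_go = sum((not r["performance"]) and screen(r) for r in records)
    return {"G": truly_go / n, "E_N": go_called_no_go / n, "E_G": no_go_called_go / n}


# Hypothetical data for four soldiers on one task.
records = [
    {"self_estimate": True, "written": True, "performance": True},
    {"self_estimate": True, "written": False, "performance": False},
    {"self_estimate": False, "written": True, "performance": True},
    {"self_estimate": False, "written": False, "performance": False},
]

# The same records serve two different screening rules.
by_self_estimate = classification_rates(records, lambda r: r["self_estimate"])
by_written_test = classification_rates(records, lambda r: r["written"])
```

Because every soldier takes all three measures, one pass of data collection yields the error proportions for each screening rule without retesting.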

MONTE  CARLO  PROGRAMMING 


Data have been collected from a small sample of soldiers for only
four Soldier's Manual tasks. The difficulties encountered in collecting
these data from troops in the field motivated planning to take maximum
advantage of such data. To this end, a Monte Carlo computer program was
written to simulate the sampling distributions of predicted benefits for
the alternative pretesting procedures. The means and variances for each
variable in the benefit equation were estimated from the sample of
soldiers who were pretested on the Soldier's Manual tasks. These
parameters were used as input to the Monte Carlo program. The program
then generates an estimate of the sampling distribution for the benefit
values of each of the alternative pretesting procedures. A one-way
ANOVA may then be used to test for significant differences among the
mean benefit values associated with each of the alternative procedures.
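
A rough sketch of such a simulation is given here, assuming the handout benefit equation B = N[D(G - 2E_N) - P + N(G - E_N)^2 C]. The original program is not reproduced in the paper, so the normal and independent draws, the clipping rules, the squad size, and every parameter value below are assumptions made only for illustration.

```python
import random
import statistics


def benefit(N, G, E_N, P, C, D):
    """Benefit in man-minutes: B = N[D(G - 2E_N) - P + N(G - E_N)^2 C]."""
    return N * (D * (G - 2 * E_N) - P + N * (G - E_N) ** 2 * C)


def simulate(params, n_draws, rng, N=10):
    """Draw each variable from a normal distribution and return simulated benefits.

    `params` maps a variable name to (mean, standard deviation). Normality and
    independence are assumptions of this sketch. Proportions are clipped to
    [0, 1] and times are clipped to be non-negative.
    """
    draws = []
    for _ in range(n_draws):
        v = {name: rng.gauss(mu, sd) for name, (mu, sd) in params.items()}
        G = min(max(v["G"], 0.0), 1.0)
        E_N = min(max(v["E_N"], 0.0), G)  # cannot misclassify more GOs than exist
        P, C, D = (max(v[k], 0.0) for k in ("P", "C", "D"))
        draws.append(benefit(N, G, E_N, P, C, D))
    return draws


def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA (between-group over within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


rng = random.Random(0)
# Hypothetical (mean, sd) inputs for two pretesting procedures.
procedure_a = {"G": (0.6, 0.1), "E_N": (0.10, 0.03), "P": (5, 1), "C": (3, 0.5), "D": (10, 2)}
procedure_b = {"G": (0.6, 0.1), "E_N": (0.25, 0.05), "P": (15, 2), "C": (3, 0.5), "D": (10, 2)}
samples = [simulate(p, 500, rng) for p in (procedure_a, procedure_b)]
f_ratio = one_way_anova_f(samples)
```

The F ratio would then be referred to an F distribution with the corresponding degrees of freedom; that lookup is omitted here.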

RESEARCH  PLANS 

Current plans call for collection of data from two infantry battalions
using a sample of 16 Soldier's Manual tasks. Apart from any
specific results which may be obtained from this sample, the methodology
described in this paper may prove useful in other applications where an
optimal cost-effective procedure needs to be selected from a set of
alternatives.

[Figure 1. Pretesting Model A. Flow chart; the GO branch leads to "Employ as peer instructor or conduct next screening test."]

[Figure 2. Pretesting Model B. Flow chart; the GO branch leads to "Employ as peer instructor or conduct next screening test."]

[Figure 3. Pretesting Model C. Flow chart: administer written test, conduct performance test; the GO branch leads to "Employ as peer instructor or conduct next screening test."]

[Figure 4. Pretesting Model D. Flow chart; the GO branch leads to "Employ as peer instructor or conduct next screening test."]

[Fig. 7. Results for task (Encode/Decode KAI-61) needed to calculate the benefit value for Model B. Surviving labels: N = 35, Self Estimate, Performance.]

[Fig. 8. Results for task (Encode/Decode KAI-61) needed to calculate the benefit value for Model C. Surviving labels: Written, Performance, Error.]


A Benefit Model Applying Where "GOs" May Be Misclassified "NO GO"


Listed below are the variables used by a cost-benefit model in which the
only measurement error is classifying a task-proficient, or "GO,"
individual as "NO GO":

$B$ = Benefit, defined as the total number of man-minutes saved by
applying any pretesting procedure.

$G$ = The proportion of squad members who are able to perform the
task (i.e., who are "GO") before training is given.

$E_N$ = The proportion of squad members who are able to perform the
task but are erroneously classified by a pretesting procedure
as "NO GO."

$G - E_N$ = The proportion of men in a squad who are correctly
classified by a pretesting procedure as "GO" on a task.

$P$ = The time that it takes a squad leader to conduct a pretesting
procedure for his entire squad.

$C$ = The time it takes to give a performance test or checkout
to one man; $C$ becomes zero when no checkout is given as a pretest.

$D$ = Time to demonstrate how to perform a task.

$N$ = The number of men in a given squad.

The term $N(G - E_N)D$ yields the amount of time that is saved by not "training"
men who are able to perform the task without training. This term yields
the primary time savings from any pretesting procedure.

The term $N E_N D$ yields the amount of time that is wasted by "training"
men who could perform the task without additional training.

The term $NP$ is the time cost incurred by pretesting. The term $N^2 (G - E_N)^2 C$
is subtracted from $NP$ to provide a credit for the time that would have
been spent in conducting a checkout on qualified performers after
training, as required by the IETS model.

The benefit model is formed by algebraically combining the above terms
as follows:

$$B = N(G - E_N)D - N E_N D - [NP - N^2 (G - E_N)^2 C]$$

$$= N[D(G - 2E_N) - P + N(G - E_N)^2 C]$$


A Model for Estimating Time Lost When "NO GOs" May Be Misclassified "GO"

Presented next are the variables and a model for computing the time lost
by the squad leader when he has to conduct a special training session for
individuals whom he had previously incorrectly classified as "GO."

$E_G$ = Proportion of squad members erroneously classified as "GO" who
are actually "NO GO."

$R$ = Time lost by squad leader by having to provide training to the
$N E_G$ group after providing training to the others, $N(1 - E_G)$.

$O$ = Time it takes squad leader to organize the training (i.e.,
obtain training materials, move to a training location, and
set up the training session) for the $N E_G$ group that is in
addition to the time spent preparing training for the $N(1 - E_G)$
group.

$M$ = Time required by an average soldier for supervised practice
until he masters the task.

$$R = O + D + M$$

$L$ = Time in man-minutes that the squad leader loses when he has to
conduct training for the $N E_G$ group after he has already taught
the $N(1 - E_G)$ group. The group that is losing the squad leader's
time is $N(1 - E_G)$.

$$L = N(1 - E_G)R = N(1 - E_G)(O + D + M)$$


A  Model  for  Estimating  the  Benefit  from  Pretesting  Where 
Both  Kinds  of  Error  May  Occur 


Finally, a model to compute the benefit attributable to pretesting where
both kinds of classification error can occur (i.e., "GO" misclassified
"NO GO" and "NO GO" misclassified "GO") is presented below.

$$B = N[D(G - 2E_N) - P + N(G - E_N)^2 C - (1 - E_G)(O + D + M)]$$
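
The handout models can be written out as a short sketch for checking the arithmetic. The function names and all numeric values are hypothetical. The equations are the ones given above, with the subscripts read as E_N for "GO" misclassified "NO GO" and E_G for "NO GO" misclassified "GO".

```python
def benefit_go_errors_only(N, G, E_N, P, C, D):
    """B = N[D(G - 2E_N) - P + N(G - E_N)^2 C], in man-minutes."""
    return N * (D * (G - 2 * E_N) - P + N * (G - E_N) ** 2 * C)


def time_lost(N, E_G, O, D, M):
    """L = N(1 - E_G)(O + D + M), in man-minutes."""
    return N * (1 - E_G) * (O + D + M)


def benefit_both_errors(N, G, E_N, E_G, P, C, D, O, M):
    """B = N[D(G - 2E_N) - P + N(G - E_N)^2 C - (1 - E_G)(O + D + M)]."""
    return benefit_go_errors_only(N, G, E_N, P, C, D) - time_lost(N, E_G, O, D, M)


# Hypothetical ten-man squad; times are in minutes.
b_first = benefit_go_errors_only(N=10, G=0.6, E_N=0.1, P=5, C=3, D=10)
loss = time_lost(N=10, E_G=0.1, O=5, D=10, M=15)
b_both = benefit_both_errors(N=10, G=0.6, E_N=0.1, E_G=0.1, P=5, C=3, D=10, O=5, M=15)
```

Written this way, the combined model is the first benefit model minus the time-lost term, which matches the bracketed expression term by term.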


Development of the Armed Services Vocational
Aptitude Battery

Malcolm  James  Ree 

Air  Force  Human  Resources  Laboratory 
Brooks  Air  Force  Base,  Texas 

I.  INTRODUCTION 

In February of 1966, a joint services committee of measurement and
evaluation experts was formed and given the responsibility for the
development and standardization of a differential aptitude battery for
use in a joint services high school testing program. The primary goal
in the development of the battery was to design a single aptitude measurement
instrument which would provide adequate coverage of the content
included in the classification batteries used by each of the individual
armed services.

II.  ARMED  SERVICES  VOCATIONAL  APTITUDE  BATTERY  (ASVAB) 

The ASVAB is composed of aptitude measures reflecting the content of
the classification batteries used by the Army, Navy, and Air Force and is
one which is used in a joint services high school testing program.
Accordingly, the Army, Navy, and Air Force batteries were administered to
a random sample of 3,900 military basic trainees (Bayroff & Fuchs, 1970).
A counterbalanced order of administration was used to prevent possible
practice effects. Intercorrelations for all test variables were computed
and served as the basis for the selection of aptitude measures
common to all three classification batteries. On the basis of these
analyses, nine subtests were chosen and organized into a battery, the
Armed Services Vocational Aptitude Battery.

Eight of the nine ASVAB subtests were selected from the Army, Navy,
and Air Force batteries. The ninth subtest was a modification of the
Army Coding Speed Test. The criteria for item selection in each subtest
were mean item difficulty level, a lower limit of acceptance in terms of
item discrimination level, and content validity. The items for each of
the nine subtests were arranged in ascending order of difficulty within
each subtest.

In September of 1968, ASVAB-1 was accepted for use in the High School
Military Testing Program. During that same year, Vitola and Alley (1968)
developed Air Force aptitude indexes for use in the operational selection
and classification program.

In  early  1974,  the  Department  of  Defense  directed  that  the  services 
move  expeditiously  toward  the  use  of  a  common  aptitude  battery  for 
enlistment  qualification.  The  Office  of  the  Assistant  Secretary  of  Defense 
(Manpower and Reserve Affairs) suggested that the Armed Services Vocational
Aptitude  Batteries  be  redesigned  to  satisfy  enlistment  production  requirements 
of  all  the  services  with  high  school  usage  being  a  secondary  consideration. 

Since the introduction of ASVAB-1, several alternate forms have
been developed. ASVAB-1 was initially used in the high school testing
program and was subsequently replaced by ASVAB-2. In September 1973,
ASVAB-3 supplanted the Airman's Qualifying Exam Form J (AQE-J) in the
Air Force Airman Selection and Classification Program. ASVAB-4 was
essentially a back-up instrument for use in case of test compromise.


The contents of ASVAB Forms 5, 6, and 7 represent a substantial
departure from previous forms. The redesign of the ASVAB was based upon
the content of the then current armed service classification batteries,
the uses being made of this content, and the services' future plans for
the modified battery. A preliminary battery plan was developed at the
Air Force Human Resources Laboratory (AFHRL) for review at the other service
laboratories.

The initial plan called for two perceptual tests, 12 cognitive power
tests, and a rather lengthy interest inventory culled from materials from the
Army Classification Inventory (ACI), the Navy Vocational Interest Inventory
(NVII), and the Air Force's Vocational Interest Choice Examination (VOICE).
It was estimated that the battery defined in the initial plan would require
over four hours of testing time.

Table  1  presents  the  contents  as  proposed  in  the  initial  plan. 


Table 1. Preliminary Plan for ASVAB (a)

Content Area                                   Number of Items

Attention to Detail                                   30
Numerical Operations                                  50
Word Knowledge                                        25
Arithmetic Reasoning                                  25
Space Perception                                      25
Mathematics Knowledge                                 25
Electronics Information                               25
Radio Information                                     15
Mechanical Comprehension                              25
Automotive Information                                25
Shop Information                                      25
Biological Science                                    15
Physical Science                                      15
General Information                                   20
Interest Inventory                                   527*
   Army Classification Inventory                     (87)
   Navy Vocational Interest Inventory               (190)
   Vocational Interest Choice Examination           (250)

Total                                                872

(a) Estimated testing time: 4 hours 6 minutes.

*Total item pool before consolidation, where possible, of content
from the three source inventories.


Table 2 shows the final content of the battery. It was essential
that the battery be shortened from its estimated 4 hours and 6 minutes,
especially for application in the joint services high school testing
program. In addition, the various recruiting service commanders desired
a testing time considerably shorter than required in the preliminary plan.
The content shown in Table 2 was arrived at after a series of joint service
committee deliberations, and was possible only because of various compromises
from what would have been considered optimal by each service. Note
that with the exception of Word Knowledge, all the scales in the final plan
were shortened from the originally planned number of items. Word Knowledge
was excepted because the U.S. Coast Guard used it to screen personnel for
officer programs and believed that anything less than 30 items would be
inadequate for that purpose.


Table 2. Final Plan for ASVAB


Content Area        AFQT (d)        Number of items        Test Time (in minutes)


Attention to Detail
Numerical Operations
Word Knowledge
Arithmetic Reasoning
Space Perception
Mathematics Knowledge
Electronics Information (a)
Automotive Information
Shop Information
General Science (b)
General Information
Classification Inventory (c)


Total 


a 

10 

20 

12 

20 

15 

10 

8 

10 

7 

2.0 

2  hrs  35  min 


(a) Composed of 15 Electronics Information and 15 Radio Information items.
(b) Composed of 10 Biological and 10 Physical Science items.
(c) Army use only.
(d) Armed Forces Qualification Test.



ASVAB-5 is a high school version, and ASVAB-6 and 7 are the current
operational production tests. Jensen, Massey, and Valentine (1976) may
be consulted for a more detailed description of the scales and the relevant
normative studies.

III.  DEVELOPMENT  OF  ASVAB  FORMS  8,  9,  AND  10 

The following steps were executed in the construction of ASVAB-8, 9,
and 10.

Initial Item Selection and Editing

Approximately 2,500 item cards were culled from the Air Force Human
Resources Laboratory historical files. After review, revision, and editing
to insure appropriate wording and grammatical agreement of item stems and
item options, 2,400 items were selected. Most were "new" items, but a
few had been used in previous ASVAB forms or in previous military selection
batteries.

Construction  of  Tryout  Booklets 

These items were assembled into 16 tryout tests of approximately
equal difficulty. Two forms (A and B) of each tryout booklet were prepared
so that no item would be always last and thus frequently omitted. The
testing time for each booklet was 90 minutes.

Administration  of  Experimental  Items 

According to a geographic sampling plan, these 16 tryout booklets
were administered at the 64 Armed Forces Entrance and Examination
Stations (AFEESs), with each station testing only four booklets. Booklets,
answer sheets, and administrative instructions were provided. The
subjects for the tryout cycle were randomly sampled applicants appearing
at the AFEESs for possible qualification for military enlistment. The
number of subjects required from the various AFEESs ranged from 28 to 84,
for a total projected sample of 3,200. Three weeks were allowed for this
testing cycle, and answer sheets for 2,588 applicants were received,
representing a loss of about 19 percent. No systematic bias was found in
the returned answer sheets, and the sample was acceptable for tryout use.

Assembly and Administration of the Proposed ASVAB Forms

Using previous ASVAB (7) items and subscales as models, three new
ASVAB forms were developed from items tried out in the experimental testing
cycle. The three new forms closely resembled each other and previous ASVABs
in terms of individual item difficulty and discrimination level. The
testing time for each experimental ASVAB (8, 9, & 10) form was 2 hours and
30 minutes.

The proposed ASVAB (8, 9, & 10) forms were administered at the 64 AFEESs.
Only one form was sent to each station for experimental administration to
minimize potential compromise problems, and each subject also took the
current ASVAB (7). Subjects for this testing cycle were randomly selected
from the applicants for military enlistment. The numbers required from
each AFEES varied from 52 to 120, for a total projected sample of 6,000.

Five weeks were allowed for this testing, and a total of 4,308
usable sets of answer sheets were returned, representing a loss of about
28 percent. Analysis indicated that the loss did not represent any particular
geographic or ability bias. These responses were then used to develop
various information about the test: reliability, average item difficulty,
and rough norms. A detailed analysis of ASVAB construction may be found
in Fruchter and Ree (1977).
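
The two loss figures quoted for the tryout and proposed-form cycles can be checked directly from the projected and returned sample sizes. The function name below is hypothetical; the numbers are those reported in the text.

```python
def loss_rate(projected: int, received: int) -> float:
    """Proportion of the projected sample for which no usable answer sheets came back."""
    return (projected - received) / projected


tryout_loss = loss_rate(3200, 2588)          # item tryout cycle
proposed_forms_loss = loss_rate(6000, 4308)  # proposed Forms 8, 9, and 10
```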

IV.  PROVIDING  ASVAB  NORMS 

The studies to provide normative conversions and tables for the ASVAB
can be divided into those for high school norms and those for armed services
norms.

ASVAB-5 was standardized on a high school sample of 35,291 male and
female students in grades 9 through 12. The sample was stratified by
geographic area, school size, and the percentage of minority enrollment.
Student scores were then weighted to make the sample represent the
national high school population. This study produced normative tables by
grade and sex for the ASVAB subtests and high school composites. This
enabled the ASVAB to be used as a high school guidance tool as well as
for military enlistment. A detailed description of the development of
high school norms may be found elsewhere (Adkins, 1976).

Jensen, Massey, and Valentine (1976) reported the development of the
armed services' norms for ASVAB-5, 6, and 7. Procedures for producing
the normative data included the administration of ASVAB-5, 6, and 7 to
a nationally representative sample of applicants for military enlistment
at the 64 AFEESs. Examinees took one form of the ASVAB (5, 6, or 7)
and either the Armed Forces Qualification Test (AFQT) composite from the
Army Classification Battery or the ASVAB-3 in a counterbalanced administration.
From the responses of approximately 4,500 examinees evenly
divided among ASVAB-5, 6, and 7, a stratified sample of 1,600 responses
was used to compute percentile equivalents for each raw score value on
all subtests and military composites.


V.  STANDARDIZATION  AND  EQUATING  OF  ASVAB  FORMS 

A sample of 2,052 male and female high school students in grades 9
through 12 from 26 schools participated in a study to standardize the
ASVAB-5 to ASVAB-2. Test administration conditions were uniform within
each school for the presentation of Form 2 and Form 5 on two consecutive
days. For reasons of maximum comparability (Flanagan, 1951), tests were
scheduled at the same time and in the same rooms on either two consecutive
mornings or two consecutive afternoons. The tests were administered in a
counterbalanced design with an equal number of subjects having the tests
administered in the order ASVAB-2 then ASVAB-5, and the order ASVAB-5 then
ASVAB-2.

The scores were equated by an equipercentile method (Angoff, 1971).
Then, using a technique originally implemented by Lindsay and Prichard
(1971), the fitting of the curve to the equated data points was done by
an iterative least-squares regression procedure.

The results of this study were a series of tables equating scores
of subtests and composites on ASVAB-5 to percentile scores of subtests
and composites on ASVAB-2. These conversion tables were developed for
each grade, for males, for females, and for both sexes (Fletcher & Ree,
1976).


ASVAB scores have also been equated to scores on the Differential
Aptitude Test Battery (DATB) and the General Aptitude Test Battery (GATB)
(Kattner, 1976). From two schools, 1,232 ninth through twelfth grade
males and females were tested on either the ASVAB and DATB or on the ASVAB
and GATB. The tests were administered in a counterbalanced order and
then scored. Equations for predicting ASVAB subtests from DATB subtests
and for predicting ASVAB subtests from GATB subtests were developed. These
scores on the DATB or GATB were then equated to ASVAB scores. This helped
in understanding the nature of all the tests involved and provided the
counselor with another tool for vocational assessment.
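
The equipercentile idea used in this section can be illustrated with a minimal sketch: a score on one form is matched to the score on the other form that has the same percentile rank. The function names, the linear interpolation, and the clamping at the ends are assumptions of this sketch. It does not reproduce the procedure in Angoff (1971) or the iterative least-squares smoothing step described above.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_scores, x):
    """Percent of scores below x plus half the percent exactly at x."""
    below = bisect_left(sorted_scores, x)
    at = bisect_right(sorted_scores, x) - below
    return 100.0 * (below + 0.5 * at) / len(sorted_scores)


def equipercentile_equivalent(x, scores_x, scores_y):
    """Form Y score whose percentile rank matches that of score x on form X.

    Linear interpolation between adjacent Y raw scores; values outside the
    observed Y range are clamped to its ends.
    """
    sx, sy = sorted(scores_x), sorted(scores_y)
    target = percentile_rank(sx, x)
    y_values = sorted(set(sy))
    y_ranks = [percentile_rank(sy, y) for y in y_values]
    if target <= y_ranks[0]:
        return float(y_values[0])
    if target >= y_ranks[-1]:
        return float(y_values[-1])
    i = bisect_left(y_ranks, target)  # first rank >= target, so i >= 1 here
    lo_r, hi_r = y_ranks[i - 1], y_ranks[i]
    lo_y, hi_y = y_values[i - 1], y_values[i]
    return lo_y + (hi_y - lo_y) * (target - lo_r) / (hi_r - lo_r)


# Hypothetical raw scores on two forms taken by comparable groups.
form_x = [1, 2, 3, 4, 5]
form_y = [11, 12, 13, 14, 15]
equivalent = equipercentile_equivalent(3, form_x, form_y)
```

In practice the equated points would be computed for every raw score and then smoothed before conversion tables are printed.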

VI.  FUTURE  PLANS 

The development of high school and operational forms of ASVAB will
continue. In addition, the development of adaptive testing technology
will be studied and monitored. Already three subtests of ASVAB, Word
Knowledge, Arithmetic Reasoning, and Space Perception (the AFQT), have
been adapted for presentation via computer media (Ree, 1977). As this
area of technology develops, its application to meeting the goals of
ASVAB testing will be explored.


INTRODUCTION


Problem 

The Armed Services Vocational Aptitude Battery (ASVAB), Forms 6 and 7, have
been used for selection and initial assignment of recruits by all Armed Services
under the Department of Defense since January 1976. This Battery, which
contains tests similar to those in the earlier classification batteries of the
various services, has been validated in only a small portion of Navy schools.
In order to maintain effective standards for Class A-school selection based on
ASVAB tests, the Bureau of Naval Personnel has requested additional evaluation
of the ASVAB for predicting performance in Class A-schools. A recent, increasingly
important problem has been the unacceptably high attrition from Basic
Electricity and Electronics (BE&E) School. A special effort to reduce attrition
by changes in selector tests was made for a set of electromechanical
ratings in conjunction with the corresponding A-schools.


APPROACH 


Samples 

The ASVAB was administered by classification testing personnel to Navy
applicants at the time of enlistment at an Armed Forces Entrance and Examining
Station, at a mobile examining test site, or at a Naval Training Center. Subsequently,
most of the accepted applicants were assigned to various Navy Basic
and Class A-Schools for training. Forty-one A-Schools were included in this
validity study with sample sizes presented in Table 1. Students in most of
the included schools completed school training by December 1976. Students in a
few schools completed school training as late as April 1977. Most of these
samples do not include all or even a majority of students who completed
school during 1976 for various reasons: (1) students beginning school training
before April 1976 had entered the Navy when the Basic Test Battery rather
than the ASVAB was used for classification, (2) school criterion data were not
available by the cutoff date for inclusion in the sample sets, and (3) the
number of academic drops for some schools is smaller than is known to be the
case.

Variables 

1. Predictors. All the separate ASVAB test variables are reported as
Navy Standard Scores (NSS) having a mean of about 50 and a standard deviation
of 10 for an unrestricted recruit population. The 12 subtests are:

General Information (GI): A 15 item test of general knowledge which
includes questions on sports, outdoor activities, automobile mechanics, and
history. Testing time is 7 minutes.

Numerical Operations (NO): Measures how rapidly and accurately the
examinee can add, subtract, multiply, and divide small whole numbers. Testing
time is 3 minutes for 50 items.

Attention to Detail (AD): This tests a person's ability to pick out
details rapidly. Each item contains two lines of c's and o's. The number of
c's must be counted and the correct answer selected from five alternatives.
Testing time is 5 minutes for the 30 items.

Word Knowledge (WK): This test presents 30 vocabulary words. The
examinee must select from four alternatives the word which most nearly has
the same meaning as the given word. Testing time is 10 minutes.

Arithmetic Reasoning (AR): This test consists of 20 reasoning problems
in sentence form. The examinee must solve each problem and select the correct
answer from four alternatives. Testing time is 20 minutes.

Space Perception (SP): A 20 item pictorial test consisting of flat
patterns and drawings of three-dimensional geometrical figures. Broken lines
on the flat pattern show where it is to be folded. The examinee's task is to
select the three-dimensional figure which could be made from the flat pattern
or to select the flat pattern which represents the three-dimensional figure.
Each item has four alternatives. Testing time is 12 minutes.

Mathematics Knowledge (MK): A 20 item test which requires some
knowledge of algebra, geometry, fractions, decimals, and exponents. The
correct answer must be selected from four alternatives. Testing time is 20
minutes.

Electronics Information (EI): A 30 item test of the examinee's knowledge
of electrical and electronic components, principles, symbols, and
diagrams. The correct answer must be selected from four alternatives. Testing
time is 15 minutes.

Mechanical Comprehension (MC): In this 20 item test a drawing
illustrates a mechanical principle and a question is asked about the drawing.
The correct answer must be selected from four alternatives. Testing time is
15 minutes.

General Science (GS): This is a 20 item test of knowledge of physical
and biological science. Each item has four alternatives. Testing time is 8
minutes.

Shop Information (SI): This test consists of 20 questions about shop
practices and the use of tools. Some of these four-alternative questions are
pictorial. Testing time is 8 minutes.

Automotive Information (AI): This test has 29 questions about
automobile parts and their operations. Each item has four alternatives.
Testing time is 10 minutes.

Along with the twelve individual cognitive ASVAB tests, sixty-three combinations
of tests were included as predictors. These included the four commonly
used Navy composites, five special composites for individual Navy schools,
and fifty-four other combinations of ASVAB tests to discover alternate composites
that might prove to be more valid than existing ones. Of these sets, twenty-three
were 2-test sets, twenty-five were 3-test sets, and six were 4-test sets.
The composites used by the other services were included as variables. Several
of the Army composites include one of the four scales in the Classification
Inventory (CI), the thirteenth test in the ASVAB. The Classification Inventory
scales were not included in the Army composites because the CI scale scores
were not included on the Navy Enlisted Master tape extract, the source of
ASVAB scores.

2. Criteria. Class A-School criteria were obtained from individual schools
on a school reporting form provided to the Navy Personnel Research and Development
Center (NPRDC). Final School Grades (FSG) were available for nineteen
A-Schools. For the other twenty-two A-Schools using a self-paced mode of
instruction, a Days-in-Training (DAYS) criterion was used. This was computed
from the course starting and completion dates reported for the students. A
pass-fail criterion was obtained for BE/E school students from the Chief of
Naval Education and Training.


Data  Analysis 

Means, standard deviations, and correlations among predictors and A-School
criterion variables were computed for each school. The validity for each predictor
was corrected for restriction in range which occurred when students were
selected for technical training.


Multiple  correlations  were  computed  from  uncorrected  correlations  for  the 
twelve  ASVAB  cognitive  tests  for  each  course.  An  accretion  method  was  used  in 
which  a  multiple  correlation  was  computed  after  the  addition  of  each  test. 
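
One way to picture the accretion method is the sketch below, which recomputes the multiple correlation each time a test is added. It works from a predictor intercorrelation matrix `r_xx` and a vector of validities `r_xy`, using R^2 = r_xy' (r_xx)^-1 r_xy. The greedy rule for choosing the next test, the function names, and the small example are assumptions; the paper does not state how the order of entry was decided.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def multiple_r(r_xx, r_xy, subset):
    """Multiple correlation of the criterion with the predictors in `subset`."""
    sub_xx = [[r_xx[i][j] for j in subset] for i in subset]
    sub_xy = [r_xy[i] for i in subset]
    beta = solve(sub_xx, sub_xy)  # standardized regression weights
    r_squared = sum(b * r for b, r in zip(beta, sub_xy))
    return max(r_squared, 0.0) ** 0.5


def accretion(r_xx, r_xy):
    """Add one test at a time, each time the test that raises R the most.

    Returns a list of (test index, R after adding it).
    """
    chosen, steps = [], []
    remaining = list(range(len(r_xy)))
    while remaining:
        best = max(remaining, key=lambda t: multiple_r(r_xx, r_xy, chosen + [t]))
        chosen.append(best)
        remaining.remove(best)
        steps.append((best, multiple_r(r_xx, r_xy, chosen)))
    return steps


# Hypothetical example: two uncorrelated tests with validities .30 and .40.
steps = accretion([[1.0, 0.0], [0.0, 1.0]], [0.3, 0.4])
```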


RESULTS  AND  DISCUSSION 

The basic validity data for each of the forty-one A-Schools are presented
in Table 2. The schools are arranged in alphabetic order within selector
composite groups, with schools having a Final School Grade criterion listed
first followed by those with a Days-in-Training criterion. Uncorrected and
corrected zero-order validities are presented as well as uncorrected multiple
correlations for all twelve ASVAB cognitive tests and for the most valid set
of five and three tests in each school. Validities of current selector composites
and of tests included in them are underlined.

From Table 2 it can be seen that ASVAB validity depends substantially
on what criterion of school performance was used. For the nineteen A-schools
which have a Final School Grade criterion, the median uncorrected validity of
the current selector composite is .46, with a range from .16 to .69. The
median corrected validity is .64, indicating substantial predictiveness about
equal to that previously reported for the Navy Basic Test Battery (Thomas,
1970).

Since A-school students must have minimum classification test scores,
usually above the mean, test validities obtained for school samples are lower
than would be found for a sample with a broader range of ability. The obtained
validities can be adjusted or "corrected" to reflect what the validities would
be for a sample covering a full range of ability. This also permits a fairer
comparison of test validities for schools with different required scores on
classification tests. The formulas used for the corrected correlations are
presented in Guilford, J. P., Fundamental Statistics in Psychology and
Education, New York: McGraw-Hill, 1956, pp. 320-321. Case I was used to
correct the validity of the variable used in selection. Case III was used for
other variables.


Twenty-two of the schools in this analysis are self-paced and do not
compute a Final School Grade for their students. In these schools the course
is customarily divided into modules of instruction. The student must pass a
test on each module with a minimum grade of 90% before advancing to the next
module. For these schools the median validity of the ASVAB selector composite
against a Days-in-Training criterion is only -.075, with a range from .18 to
-.29. (A negative validity is expected for the Days-in-Training criterion,
since fewer days to complete a course should reflect greater ability.)

The picture is not completely dismal for the Days-in-Training criterion.
For the nine self-paced schools using the Mechanical or Electronics composites
the median uncorrected validity of the current selector composite is -.21.
The median corrected validity is -.39. While these values are less than
satisfactory, they are not zero, as is the case for the thirteen self-paced
schools using the General Technical or Clerical selector composites.

We do not yet fully understand why the validities for self-paced courses
are so low. We know of some factors that could reduce the validities, but do
not know to what extent these factors are present. For example, in discussions
with school administrative personnel it was learned that some students could
have finished the course earlier than they did, but postponed completion of
the course until the end of a week rather than finish early and be assigned
to General Detail for a few days while awaiting transfer to a new duty station.
In some self-paced schools the variance of Days-in-Training is small, not much
more than it is in some lock-step courses.

This is a problem that has not yet received as much attention as it
deserves. At the last MTA meeting Dr. Raymond Christal had some very
worthwhile comments related to this topic as well as suggestions for research
(Christal, 1976). It is an especially important problem because more and more
schools have gone from a lock-step mode of instruction to a self-paced mode
over the past few years, and the trend is continuing.

The maximum validity of the ASVAB is shown for each school by the multiple
correlation using all twelve ASVAB tests. The median R across the 19 schools
with FSG is .60, somewhat higher than the .46 I mentioned earlier for the
median uncorrected validity of the current selector composite. There seems to
be room for some increase in validity. Also shown are the validities of the
most valid sets of five and three ASVAB tests. There is very little difference
between the validity of the five most valid tests and all twelve ASVAB tests
and, for some schools, not much difference between the most valid set of three
and all twelve tests. The reason for listing validities of five and three
tests is that no current ASVAB composite has more than five tests, and three
is a more usual number of tests comprising a composite.

From examining the validities of ASVAB subtests and selector composites
you can see that the operational composites do not have the highest validities
for many schools. Zero-order validities of over fifty sets of 2, 3, or 4
ASVAB tests were compared with the current selector composites. This
inspection showed that seldom did the 4-test linear sums yield higher
validities than the best 2- and 3-test sets. Therefore, the most valid sets of
2 and 3 tests were extracted and are shown in Table 3 along with the
validities of the operational selector composites for each school. It can be
seen that for most schools several, or sometimes many, sets of tests yield
almost the same validities. This reflects the high relationships among many of
the ASVAB tests. These data were examined, along with similar data from a
concurrent ASVAB validity report on thirty-one schools (Swanson, 1976) and, in
some cases, earlier Navy Basic Test Battery validity data (Thomas, 1970, 1973)
on the same schools, in order to arrive at recommended changes in selector
composites.
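
The comparison of 2- and 3-test linear sums can be sketched from a
correlation matrix alone. This is a minimal sketch, assuming the tests are
summed in standardized form; the paper does not state how raw scores were
scaled before summing, and the test names and correlations below are invented
for illustration. The optional weights argument covers integer-weighted
composites of the kind written 2MK+AR+GS.

```python
from itertools import combinations
from math import sqrt


def composite_validity(rxx, rxy, tests, weights=None):
    """Correlation of a weighted sum of standardized tests with a criterion.

    rxx      matrix of test intercorrelations
    rxy      list of test validities
    tests    indices of the tests in the composite
    weights  integer or real weights (unit weights if omitted)
    """
    weights = weights or [1] * len(tests)
    num = sum(w * rxy[t] for w, t in zip(weights, tests))
    var = sum(wi * wj * rxx[ti][tj]
              for wi, ti in zip(weights, tests)
              for wj, tj in zip(weights, tests))
    return num / sqrt(var)


def best_sets(names, rxx, rxy, size, top=3):
    """Rank all unit-weighted sets of `size` tests by absolute validity."""
    scored = [(tuple(names[t] for t in combo),
               composite_validity(rxx, rxy, combo))
              for combo in combinations(range(len(names)), size)]
    # Absolute value, so that negative DAYS validities rank correctly.
    return sorted(scored, key=lambda s: -abs(s[1]))[:top]


names = ["A", "B", "C"]                      # hypothetical tests
rxx = [[1.0, 0.6, 0.0],
       [0.6, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
rxy = [0.4, 0.4, 0.3]
pairs = best_sets(names, rxx, rxy, size=2)
```

Because highly intercorrelated tests add little to one another, many sets
end up with nearly equal validities, which is the pattern the text describes.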

The Basic Electricity and Electronics (BE/E) course is a prerequisite for
over twenty A-schools. There has been an excessive attrition rate for BE/E
students destined for some of these ratings. Of particular concern were four
electromechanical ratings: EM, IC, CE, and GM. A study by Dann and Abrahams
(1977) examined attrition in BE/E School and considered these four ratings.

The present study evaluates the joint impact of selector composite changes on
BE/E and "A" school. There are Class A-school samples in this study for only
two of these four ratings, EM and GM. Analyses of ASVAB test validities were
made for the separate groups of BE/E students destined for the individual
schools and for BE/E students with a common selector composite. Regression
analysis was used with hold-out samples for cross-validation. The details of
the procedures and results are presented in an unpublished report by Dann and
Abrahams (1977). The validities of the current selector composites and of the
most promising new selector composites, with integer weights for each test
rather than the more precise regression weights, are shown for the four school
samples in Table 4. The validities of the recommended selector composites
are enclosed in a dotted box. For Gunner's Mate students the recommended
Electronics composite yields an increase in validity from .17 to .37 in the
BE/E school, where substantial attrition has occurred. This is accompanied
by a slight decrease in A-school validity from .46 to .41. The overall effect
is expected to be a larger throughput of trained students into the Gunner's
Mate rating. For EM students the newly recommended BE/E composite, 2MK+AR+GS,
improves BE/E school validity from .11 to .41, which should reduce BE/E
school attrition without reducing the validity obtained with the present
A-school composite. There is no IC A-school in this analysis. Nevertheless,
a change to the new BE/E selector composite for both IC Class A and BE/E
students seems warranted on the basis of similarity in course content, job
knowledge required, and a history of similar validities for like tests for
these courses.

The Construction Electrician sample size is too small to make a strong case
for change. Validity data for additional students in these and other BE/E and
A-schools will be available within a short time to check on these
recommendations.

In at least 12 other schools the validities of alternate selector composites
were sufficiently higher than validities of the present composites to be
considered for operational use. In some of these schools the sample sizes were
too small to have confidence that the alternate composite validities would hold
up. The schools in which changes in selector composites appear warranted are
shown in Table 5. In two schools, MM and EN, validity differences from present
composite validities are small. These schools are nevertheless included in
order to have a uniform selector composite for a set of related mechanical
schools.


CONCLUSIONS 


The  following  conclusions  appear  warranted  from  the  analysis. 

1. Final school grade is a more predictable criterion than
Days-in-Training.

2. For schools with a final school grade criterion, the present Navy
ASVAB composites are about as valid as were Navy Basic Test Battery composites.

3. A single ASVAB selector composite for both BE/E and "A" schools
will reduce BE/E school attrition and still be effective in "A" schools for a
group of electromechanical schools.

4. The large number of sets of 2 and 3 tests yielding very similar
validities suggests a lack of differential validity among the ASVAB tests.

5. Effective school classification in the Navy may be accomplished
with fewer subtests in the ASVAB, through elimination or combining
of current ones.


6. Research on school criterion measures with particular emphasis on
self-paced schools is needed.

Table 1

Samples Included in ASVAB 6/7 Analysis

                                                                       N With Criteria
                                               Course                                     Academic
Course or Rating                               Code   Location         Total   Graduate   Drop

Air Controlman                            AC   6278   Memphis          52      52         0
Aviation Machinist's Mate, Jet            ADJ  6501   Memphis          385     365        20
Aviation Structural Mechanic, Hydraulics  AMH  6517   Memphis          78      78         0
Aviation Structural Mechanic, Structures  AMS  6518   Memphis          89      89         0
Aviation Ordnanceman                      AO   6506   Memphis          136     136        0
Avionics Technician, Aviation             AT   6239   Memphis          265     233        32
  Electronics Technician
Aviation Antisubmarine Warfare Operator   AW   6537   Memphis          92      92         0
Avionics Technician, Aviation             AX   6241   Memphis          60      36         4
  Antisub Warfare Technician
Aviation Maintenance Administration       AZ   6528   Meridian         66      66         0
Boiler Technician                         BT   6260   Great Lakes      753     701        52
Communications Technician, Administrative CTA  6020   Corry Station    57      48         9
Communications Technician, Communications CTO  6053   Corry Station    73      60         13
Communications Technician, Collection     CTR  6301   Corry Station    55      39         16
Communications Technician, Technical      CTT  6302   Corry Station    118     89         29
Communications Technician, Field          CTT  6320   Corry Station    35      35         0
  Operations Special Non-Morse
Dental Technician                         DT   6086   San Diego        166     159        7
Electrician's Mate                        EM   6070   Great Lakes      169     169        0
Engineman                                 EN   6261   Great Lakes      389     382        7
Electronics Technician, Communications    ET   6263   Great Lakes      254     254        0
Electronics Technician, Radar             ET   6265   Great Lakes      202     202        0
Electronics Technician, Communications    ET   6266   Great Lakes      64      64         0
Fire Control Technician, Missile          FT   6027   Great Lakes      91      91         0
Gunner's Mate, Guns                       GM   6115   Great Lakes      109     109        0
Hospitalman                               HM   6084   Great Lakes      1214    1126       88
Hospitalman                               HM   6085   San Diego        1079    1021       58
Hull Maintenance Technician               HT   6119   San Francisco    160     158        2
Hull Maintenance Technician               HT   6120   Philadelphia     289     287        2
Machinist's Mate                          MM   6262   Great Lakes      1444    1411       33
Mess Management Specialist                MS   6125   San Diego        103     103        0
Operations Specialist                     OS   6142   Great Lakes      220     220        0
Postal Clerk                              PC   6300   Ft. B. Harrison  36      36         0
Polaris/Poseidon Electronics              PE   6146   Dam Neck         91      91         0
Photographer's Mate                       PH   6523   Corry Station    43      43         0

(Continued on next page)


Table 1 (continued)

Samples Included in ASVAB 6/7 Analysis

                                                                       N With Criteria
                                               Course                                     Academic
Course or Rating                               Code   Location         Total   Graduate   Drop

Personnelman                              PN   6102   Meridian         135     124        11
Aircrew Survival Equipmentman             PR   6519   Lakehurst        76      75         1
Quartermaster                             QM   6001   Orlando          65      65         0
Radioman                                  RM   6144   San Diego        681     643        38
Radioman Sea Duty                         RM   6380   San Diego        225     225        0
Radioman Shore Duty                       RM   6381   San Diego        221     221        0
Signalman                                 SM   6005   Orlando          42      42         0
Yeoman                                    YN   6057   Meridian         212     174        38


Table 2 (Continued)

Decimal points are omitted.
Validities of the composite used for selection and the tests in it are
underlined.
A negative validity is expected for the DAYS criterion.
r_u = uncorrected validities; r_c = validities corrected for restriction of
range.


Table 3

Validities of ASVAB 6/7 Selection Composites and the Most
Valid Sets of Two and Three ASVAB Tests for 41 Schools

Validity of ASVAB 6/7 Selector Composite (WK+AR)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

AC      6278  52    FSG        39   67     AR+AI     54   73     MC+SI+AI    55   73
                                           SI+AI     54   72     WK+MC+SI    54   73
                                           MK+AI     51   70     AR+MC+AI    54   73
                                           AR+SI     50   71     AR+GS+AI    54   73

AW      6537  92    FSG        23   30     NO+MK     41   44     MK+EI+GS    36   40
                                           MK+GS     37   40     NO+AD+MK    35   39
                                           MK+EI     36   40     AR+MK+GS    34   38
                                           NO+WK     33   38     MK+EI+MC    33   38
                                                                 MK+MC+SI    33   38

AZ      6528  66    FSG        54   80     MK+EI     68   85     MK+EI+MC    65   83
                                           WK+MK     66   84     MK+EI+GS    62   82
                                           MK+MC     62   81     MK+MC+AI    61   81
                                           AR+MK     61   82     AR+MK+GS    60   82
                                           NO+MK     61   81

HM      6084  1214  FSG        49   73     WK+MK     55   72     WK+AR+GS    52   71
                                           MK+GS     53   71     AR+MK+GS    52   71
                                           NO+WK     50   69     MK+EI+GS    50   70

HM      6085  1079  FSG        44   70     WK+MK     49   72     AR+MK+GS    49   72
                                           MK+GS     47   70     WK+AR+GS    48   71
                                           NO+WK     45   69     MK+EI+GS    44   69
                                           NO+WK     45   67     WK+AR+MC    43   69
                                           AR+MK     44   69

MS      6125  103   FSG        53   79     NO+WK     58   79     MK+EI+GS    56   78
                                           WK+MK     55   79     WK+AR+GS    55   80
                                           MK+EI     53   76     WK+AR+SI    52   78
                                           MK+GS     51   76     AR+EI+GS    52   78
                                                                 AR+MK+GS    52   78

OS      6142  220   FSG        32   58     MK+MC     45   64     MK+MC+SI    43   63
                                           AR+MC     44   64     WK+AR+MC    41   62
                                           NO+AR     41   62     AR+EI+MC    41   62
                                           AR+MK     40   62     AR+MC+SI    41   62
                                           NO+MK     39   59     MK+EI+MC    41   61
                                                                 MK+MC+AI    41   61

SM      6005  42    FSG        31   57     AR+MK     51   67     AD+WK+AR    44   63
                                           AR+SI     47   65     AR+MC+SI    42   62
                                           AD+AR     47   64     AR+MK+GS    41   62
                                           AD+MK     47   63     MK+MC+SI    37   59
                                           AR+MC     43   63


Table 3 (Continued)

Validity of ASVAB 6/7 Selector Composite (WK+AR)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

CTO     6053  73    DAYS       -04  -09    MK+EI     -06  -11    AR+EI+MC    -09  -13
                                           AR+MC     -06  -11    MK+EI+GS    -08  -12
                                           SP+EI     -05  -09    MK+EI+MC    -07  -11
                                           MK+GS     -05  -11

CTR     6301  55    DAYS       10   18     None valid.           None valid.

CTT     6302  118   DAYS       -03  -06    AD+WK     -20  -18    WK+AD+NO    -13  -13
                                           NO+WK     -13  -13
                                           WK+MC     -12  -12

CTT     6320  35    DAYS       -15  -26    AR+AI     -23  -23    AR+GS+AI    -30  -36
                                           SI+AI     -18  -19    AR+EI+GS    -29  -35
                                           MK+GS     -14  -26    WK+AR+GS    -21  -30
                                           MK+EI     -14  -24    MK+EI+GS    -21  -30
                                           MK+AI     -14  -23

DT      6086  166   DAYS       01   02     AR+AI     -13  -07    NO+AD+SP    -11  -11
                                           AD+SP     -11  -11    AR+SP+MC    -11  -07
                                           SP+MC     -11  -09    AR+MC+AI    -11  -07
                                           AR+SP     -11  -08    MC+SI+AI    -11  -07
                                           SI+AI     -11  -08

PH      6523  43    DAYS       10   19     None valid.           None valid.

PC      6300  36    DAYS       -02  -04    AR+AI     -22  -20    AR+GS+AI    -18  -15
                                           MK+AI     -16  -16    AR+MC+AI    -10  -10
                                           NO+AR     -13  -12    AR+EI+GS    -09  -09
                                           GI+AI     -12  -12
                                           AR+MK     -10  -09

PN      6102  135   DAYS       -11  -18    NO+WK     -22  -26    WK+AD+NO    -21  -26
                                           AD+WK     -19  -24    AD+WK+AR    -16  -22
                                           NO+AD     -17  -22    NO+AD+MK    -16  -21
                                                                 NO+AD+AR    -15  -21

RM      6144  681   DAYS       08   14     None valid.           None valid.

RM      6380  225   DAYS       18   31     None valid.           None valid.

RM      6381  221   DAYS       02   04     None valid.           None valid.

Table 3 (Continued)

Validity of ASVAB 6/7 Selector Composite (WK+MC+SI)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

EM      6070  169   FSG        67   84     MK+MC     69   85     MK+MC+SI    72   87
                                           MK+GS     69   84     AR+MK+GS    71   85
                                           AR+MC     67   84     AR+EI+GS    70   85
                                           WK+MC     65   84     AR+GS+AI    70   85
                                                                 MK+EI+GS    70   85
                                                                 MK+EI+MC    69   85
                                                                 2MK+AR+GS   68   83

GM      6115  109   FSG        46   73     AR+AI     49   74     AR+GS+AI    51   75
                                           SI+AI     48   73     AR+MC+AI    49   75
                                           GI+AI     47   73     MC+SI+AI    49   74
                                           WK+MC     45   73     WK+AR+GS    48   74
                                                                 WK+M+GS     48   74
                                                                 2MK+AR+GS   29   62

BT      6260  253   DAYS       -14  -19    MK+AI     -25  -35    MK+MC+AI    -21  -33
                                           AR+AI     -24  -35    AR+MK+GS    -21  -33
                                           NO+AR     -22  -34    AR+MC+AI    -20  -33
                                           NO+MK     -22  -33    AR+GS+AI    -20  -32
                                           MK+EI     -22  -33    MK+EI+MC    -20  -32
                                           AR+MK     -21  -33    2MK+AR+GS   -21  -33
                                           MK+MC     -20  -32

EN      6261  389   DAYS       -21  -37    AR+AI     -25  -40    AR+EI+MC    -24  -39
                                           GI+AI     -25  -40    AR+MC+AI    -24  -39
                                           AR+MC     -25  -40    AR+MK+GS    -23  -38
                                           MK+AI     -24  -39    MK+MC+AI    -23  -38
                                           MK+MC     -23  -39    AD+WK+AR         -38
                                           MK+EI     -23  -38    2MK+AR+GS   -23  -38
                                           AR+MK     -23  -38

HT      6119  160   DAYS       -19  -39    MK+GS     -25  -41    AR+GS+AI    -26  -42
                                           MK+AI     -24  -40    MK+EI+GS    -25  -41
                                           AR+SI     -24  -41    AR+MK+GS    -25  -41
                                           AR+AI     -23  -40    AR+EI+GS    -24  -41
                                           NO+MK     -23  -37    AR+SP+GS    -23  -40
                                                                 MK+MC+AI    -23  -40
                                                                 NO+AD+AR    -23  -37
                                                                 NO+AD+MK    -23  -37
                                                                 2MK+AR+GS   -24  -40

HT      6120  289   DAYS       -11  -21    AR+MC     -17  -26    AR+EI+MC    -21  -28
                                           AR+AI     -17  -25    MK+EI+MC    -21  -28
                                           MK+AI     -17  -25    AR+EI+GS    -19  -27
                                           MK+MC     -16  -25    AR+MC+AI    -18  -26
                                                                 MK+EI+GS    -18  -26
                                                                 MK+MC+AI    -18  -26
                                                                 2MK+AR+GS   -16  -24

MM      6262  1444  DAYS       -35  -53    AR+MC     -39  -55    WK+AR+MC    -41
                                           AR+MK     -39  -54    WK+AR+SI    -40
                                           NO+AR     -39  -53    WK+AR+GS    -40
                                           WK+MK     -38  -54    AR+MK+GS    -40
                                           MK+EI     -37  -53    AR+EI+MC    -39
                                           AR+SI     -37  -54    AR+EI+GS    -39
                                           NO+WK     -37  -52    2MK+AR+GS   -39
                                           MK+MC     -36  -53    AR+MC+SI    -38
                                           MK+GS     -36  -53    AR+GS+AI    -38
                                           NO+MK     -36  -51
                                           AR+AI     -36  -53
                                           MK+AI     -34  -51

PR      6519  76    DAYS       -02  -04    MK+GS     -21  -17    MK+MC+SI    -19  -19
                                           MK+AI     -19  -16    AR+MK+GS    -18  -15
                                           NO+WK     -19  -17    AR+GS+AI    -18  -14
                                           SI+AI     -18  -14    MK+EI+GS    -18  -14
                                           GI+AI     -18  -15    MK+MC+AI    -18  -13
                                           NO+SP     -17  -17    SP+MK+MC    -17  -13
                                           SP+MK     -17  -15


Table 3 (Continued)

Validity of ASVAB 6/7 Selector Composite (AR+MK+EI+GS)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

AO      6506  136   FSG        41   79     AR+MC     49   81     WK+AR+MC    30   81
                                           WK+MC     47   80     AR+EI+MC    48   81
                                           AR+AI     42   79     WK+MC+SI    48   80
                                           MK+MC     41   79     AR+MC+AI    46   80
                                                                 AR+MC+AI    46   80
                                                                 MK+MC+SI    46   80
                                                                 2MK+AR+GS   32   77

ET      6263  254   FSG        46   75     MK+EI     44   74     MK+EI+MC    47   75
                                           MK+MC     42   73     AR+EI+MC    45   74
                                           WK+MC     41   72     AR+EI+GS    44   74
                                           AR+MC     39   71     MK+EI+GS    44   74
                                                                 SP+MK+EI    43   73
                                                                 2MK+AR+GS   40   73

ET      6265  202   FSG        52   82     MK+EI     51   82     MK+EI+MC    54   83
                                           MK+MC     50   81     MK+EI+GS    52   82
                                                                 AR+EI+MC    52   82
                                                                 AR+EI+GS    51   82
                                                                 2MK+AR+GS   43   80

ET      6266  64    FSG        14   26     SP+MK     21   30     AR+SP+GS    22   31
                                           MK+GS     21   30     SP+MK+MC    22   31
                                           SP+MC     20   29     WK+SP+MK    21   31
                                           MK+MC     20   29     MK+MC+SI    19   29
                                           WK+SP     19   28     2MK+AR+GS   18   28
                                           NO+SP     19   27

FT      6027  91    FSG        47   80     MK+EI     43   7?     MK+EI+GS    48   81
                                           MK+GS     40   79     AR+EI+GS    45   80
                                                                 AR+GS+AI    45   80
                                                                 AR+SP+GS    45   80
                                                                 WK+AR+GS    44   80
                                                                 2MK+AR+GS   36   77

PE      6146  99    FSG        39   76     MK+GS     46   78     MK+EI+GS    45   77
                                           MK+EI     45   77     MK+EI+MC    44   77
                                           MK+MC     43   76     2MK+AR+GS   40   76
                                           SP+MK     33   71     SP+MK+EI    39   75
                                           WK+MK     32   72     MK+MC+SI    37   74
                                                                 AR+MK+GS    36   75

ADJ     6501  365   DAYS       -36  -60    WK+MK     -36  -59    AR+MK+GS    -36  -60
                                           MK+MC     -34  -58    2MK+AR+GS   -36  -59
                                           MK+GS     -33  -58    AR+GS+AI    -33  -58
                                                                 AD+WK+AR    -33  -56

AT      6239  265   DAYS       -26  -52    MK+MC     -26  -51    AR+MK+GS    -29  -53
                                           MK+GS     -26  -52    2MK+AR+GS   -28  -53
                                           AR+MK     -25  -51    AR+SP+GS    -25  -51
                                           AR+MC     -25  -51    WK+AR+MC    -24  -51
                                           NO+AR     -24  -46    AR+EI+MC    -24  -51
                                                                 AR+EI+GS    -24  -51

AX      6241  60    DAYS       -25  -55    WK+AR     -42  -62    AD+WK+AR    -31  -56
                                           WK+MK     -33  -58    WK+MC+SI    -30  -57
                                           WK+MC     -29  -56    WK+AR+SI    -24  -55
                                           WK+SP     -27  -56    2MK+AR+GS   -20  -53


Table 3 (Continued)

Validity of ASVAB 6/7 Selector Composite (WK+AD+NO)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

CTA     6020  57    DAYS       -07  -13    NO+SP     -13  -17    WK+SP+MK    -08  -13
                                           WK+SP     -13  -16    SP+MK+MS    -08  -11
                                           SP+MC     -12  -14
                                           NO+WK     -11  -15

YN      6057  212   DAYS       -07  -11    NO+SP     -11  -16    NO+AD+SP    -12  -15
                                           AD+SP     -09  -12

Validity of ASVAB 6/7 Selector Composite (WK+MC)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

AMH     6517  78    FSG        53   74     AR+MC     51   72     WK+MC+SI    63   79
                                           AR+AI     51   ?      AR+EI+MC    59   76
                                           MK+AI     49   67     AR+MC+SI    57   75
                                           MK+EI     48   65     AR+MC+AI    57   75
                                           SI+AI     48   64     MK+EI+MC    57   75
                                           MK+MC     47   69     MK+MC+SI    57   75
                                                                 MK+MC+AI    56   75

AMS     6518  89    FSG        40   68     MK+AI     58   76     AR+MK+GS    56   75
                                           AR+MK     55   74     2MK+AR+GS   55   74
                                           AR+AI     54   74     MK+MC+SI    52   73
                                           MK+MC     53   73     MK+MC+AI    50   72
                                                                 AR+MC+SI    49   72
                                                                 AR+MC+AI    49   72
                                                                 AR+GS+AI    49   72
                                                                 WK+MC+SI    47   71

Validity of ASVAB 6/7 Selector Composite (AR+SI)

                               Selector    Most Valid Sets       Most Valid Sets of
                               Composite   of Two ASVAB Tests    Three ASVAB Tests
School  Code  N     Criterion  r_u  r_c    Tests     r_u  r_c    Tests       r_u  r_c

QM      6001  65    FSG        67   84     WK+AR     79   89     WK+AR+MC    76   88
                                           AR+MK     72   85     WK+AR+SI    76   88
                                           WK+MK     69   82     WK+AR+GI    73   86
                                           WK+MC     68   83     AD+WK+AR    73   85
                                           AR+MC     67   84     AR+EI+GS    70   85
                                           AR+AI     67   85     WK+MC+SI    68   84
                                           NO+AR     65   80     2MK+AR+GS   68   83
                                                                 AR+MK+GS    71   85
                                                                 AR+EI+MC    68   84

Notes:  1.  Decimal points are omitted from correlations.

2.  A negative validity is expected for the DAYS criterion.


Validities of Operational and Selected Alternate Composites
in 4 Ratings Presently Using Electromechanical Selectors


Table 5

A-Schools Other Than Those Originating With BE/E School For
Which Selector Composite Changes Are Proposed

                                                                 Present Validity       Proposed Validity
Rating                                         Code   Criterion  Selector   r_u  r_c    Selector   r_u  r_c

Personnelman                              PN   6102   DAYS       WK+AR      -11  -18    NO+WK      -22  -26
Boiler Technician                         BT   6260   DAYS       WK+MC+SI   -14  -19    MK+AI      -25  -35
Engineman                                 EN   6261   DAYS       WK+MC+SI   -21  -37    MK+AI      -24  -39
Machinist's Mate                          MM   6262   DAYS       WK+MC+SI   -35  -53    MK+AI      -34  -53
Aircrew Survival Equipmentman             PR   6519   DAYS       WK+MC+SI   -02  -04    MK+AI      -19  -16
Aviation Structural Mechanic, Hydraulics  AMH  6517   FSG        WK+MC      53   74     AR+MC+AI   57   75
Aviation Structural Mechanic, Structures  AMS  6518   FSG        WK+MC      40   68     AR+MC+AI   49   72
Quartermaster                             QM   6001   FSG        AR+SI      67   84     WK+AR      79   89


ASSESSMENT  OF  ARMED  SERVICES  VOCATIONAL 
APTITUDE  BATTERY  VALIDITY 


Lonnie  D.  Valentine 
Air  Force  Human  Resources  Laboratory 
Personnel  Research  Division 
Brooks  Air  Force  Base,  Texas 


It is my first intention today to provide you a capsule description
of the Air Force generated validity information available on the Armed
Services Vocational Aptitude Battery (ASVAB), along with appropriate
references to such studies for those of you who wish or need more
detailed data from the studies. I shall also briefly discuss the need
for validations against job performance and suggest one possible approach.
Last, I shall discuss the kinds of joint service test validation studies
which are needed and describe some of the data requirements and pitfalls
associated with them.

ASVAB is required to serve a broad range of purposes which are not
always compatible in a relatively short battery. For the services, it
must provide a measure for enlistment qualification (i.e., a selection
measure) and in addition must provide the initial classification measures
required by the various services. With respect to classification, the
philosophies and requirements of the services differ. Moreover, the
battery is used in a High School testing program which covers the range
of grades 9 through 12. For this purpose, maximization of differential
aptitude information is of prime importance. By contrast, the service
programs require balance between maximum prediction and maximum
differentiation between aptitude areas. I'll return to this later.

In 1973, Vitola, Mullins, and Cross reported on the validity, for
prediction of Air Force technical training grades, of Air Force classification
composites derived from ASVAB, Form 1, and compared the validity of these
composites with that of their counterparts from the Airman Qualifying
Examination (AQE) (see AFHRL-TR-73-7). Their subjects had been enlisted
on the basis of AQE and were administered ASVAB-1 at Lackland Air Force
Base on their sixth day of Basic Military Training.

They found that, in general, the obtained validity of ASVAB composites
was equal to or slightly higher than that of their AQE counterparts; the
small discrepancy in favor of the ASVAB composites probably was an artifact
of assignment to training courses (and consequently direct restriction of
range) on the AQE composites. They also found that, for most courses, the
relevant ASVAB composite was more valid for that course than were the other
three composites. Validities from this study are summarized in Tables 1
through 4 of the handout.

Valentine (1977) reported a large scale validation of ASVAB-3 Air
Force composites against Air Force technical training grades. The
objectives of the study were to (a) investigate validity of the Armed
Services Vocational Aptitude Battery and of educational background data
for Air Force technical training, (b) investigate unique predictive
contribution of both educational background and test data for Air Force
technical training success, and (c) assess homogeneity of prediction
equations for subgroups defined by sex and race.

ASVAB-3  data  were  collected  on  Air  Force  non-prior  service  enlisted 
accessions  for  September  1973  through  October  1975.  Analyses  were  conducted 
on  43  clusters  of  enlisted  technical  courses;  formation  of  these  clusters 
was  based  on  a  count  of  available  cases  in  various  courses  and  clustering 
together  of  those  which  were  judged  to  be  similar.  The  major  criterion 
variable  was  final  course  grade. 

Variables used in the study were (a) an Armed Forces Qualification
Test (AFQT) score and four Air Force Aptitude Indexes (Mechanical,
Administrative, General, and Electronics) all derived from ASVAB-3, (b) a
series of 41 binary variables indicating successful completion or
non-completion of specific high school courses, (c) disposition from
training (graduation or elimination), (d) final course grade, (e) ethnic
identity (Caucasian, Black, or Other Minority), (f) sex (male or female),
and (g) course cluster identity.

Half of the male Caucasians in each of the 43 course clusters were
randomly selected as an Educational Index (EI) development sample. The EI
development sample was restricted to Caucasian males because, in most
clusters, inclusion of women and ethnic minorities would have reduced their
number in the remaining sample below a desirable number for the analyses
contemplated for it. For each of the 43 course clusters, an Education Index
was developed using the EI development sample. For this purpose, the sample
was divided into an upper and a lower 50% criterion dichotomy by assigning
all fail cases to the lower group along with enough of the cases with the
lowest final course grades to complete 50% of the development sample. The
41 course completion variables were item analyzed against this course
success dichotomy. Those courses with significant positive correlation with
the criterion were assigned an EI scoring weight of +1, while those with
significant negative correlation with the criterion were assigned a
scoring weight of -1. The EI development samples were excluded from all
subsequent analyses; thus, all validities reported in the study represent
cross-validation values.
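
The Education Index procedure described above can be sketched in code. This is a minimal illustration only: the function names are hypothetical, and the significance rule (a large-sample test on the phi coefficient, |phi| * sqrt(N) > 1.96) is an assumption, since the talk does not say which item-analysis statistic or significance level was used.

```python
import math


def criterion_dichotomy(cases):
    """Split a development sample into lower (0) and upper (1) halves.

    cases: list of (failed, final_grade) tuples. All failures go to the
    lower group, followed by the lowest-graded graduates until the lower
    group holds half of the sample.
    """
    n = len(cases)
    n_fail = sum(1 for failed, _ in cases if failed)
    lower_size = max(n_fail, n // 2)
    # Failures sort first; graduates follow in ascending grade order.
    order = sorted(
        range(n),
        key=lambda i: (0, 0.0) if cases[i][0] else (1, cases[i][1]),
    )
    upper = [1] * n
    for i in order[:lower_size]:
        upper[i] = 0
    return upper


def phi(x, y):
    """Pearson correlation between two lists of 0/1 values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def unit_weight(completed, upper, z_crit=1.96):
    """Return +1, -1, or 0 for one high school course variable.

    ASSUMPTION: significance is judged by |phi| * sqrt(N) > z_crit,
    which is equivalent to a 1-df chi-square test on the 2x2 table.
    """
    z = phi(completed, upper) * math.sqrt(len(upper))
    if z > z_crit:
        return 1
    if z < -z_crit:
        return -1
    return 0


def education_index(course_flags, weights):
    """Score one person: sum of unit weights over completed courses."""
    return sum(w * f for w, f in zip(weights, course_flags))
```

In use, `unit_weight` would be applied to each of the 41 course-completion variables within a cluster's development sample, and `education_index` would then score cases in the held-out cross-validation sample.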

Table 5 of the handout summarizes the areas for which analyses were
accomplished and shows the number of cases available in the cross-validation
samples. For these samples, Table 6 shows validity of the AFQT, the four
Air Force aptitude composites, and the Education Index against final course
grade. Table 7 summarizes tests of hypotheses about independent contribution
to prediction of test and of educational background information. Both test
and educational background data demonstrated usefulness for prediction of
technical training performance; moreover, when used in combination with
each other, more accurate predictions are achieved than through the use
of either alone. Generally, of the two kinds of data, test data alone
provided more accurate predictions than did educational data alone, and,
moreover, introduction of test data to an equation based on educational
background provided a larger increase in prediction accuracy than was
achieved with introduction of educational background into a test-based
prediction equation. These observations also hold for prediction equations
based on specific race or sex subsamples.

To test hypotheses about homogeneity of separate race or sex regression
equations, a series of regression problems involving race membership, sex
membership, AFQT, the Selector AI, the Educational Index, and interactions
of race or sex with the test and educational variables as predictors of final
course grade were computed and compared via the F statistic. Tables 8 and 9
summarize these hypotheses and the tests of them. While not tabled in
your handout, these same hypotheses were also tested for the Educational
Index and for the test variables separately.
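
Comparing regression problems "via the F statistic" is conventionally done with the nested-model F test on the two squared multiple correlations. The sketch below shows that standard formula; the function name and the numbers in the example are hypothetical and are not taken from Tables 8 or 9.

```python
def nested_f(r2_full, r2_reduced, k_full, k_reduced, n):
    """F test for the gain in R-squared of a full over a reduced model.

    r2_full, r2_reduced: squared multiple correlations of the two models
    k_full, k_reduced:   number of predictors in each model
    n:                   number of cases
    Returns (F, numerator df, denominator df).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    f = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f, df_num, df_den


# Hypothetical example: adding three group-by-predictor interaction terms
# to a four-predictor equation in a sample of 500 cases.
f_value, df1, df2 = nested_f(0.30, 0.25, k_full=7, k_reduced=4, n=500)
print(round(f_value, 2), df1, df2)
```

A large F for the interaction terms would indicate that subgroup slopes differ; a large F for group membership alone would indicate differing intercepts.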

In many instances, separate race or sex prediction equations are not
homogeneous (i.e., the subgroup equations differ from each other enough
that added accuracy in prediction is achieved by using a separate equation
for each subgroup); this observation is more often true for race-based
subgroups and for predictions based on educational background data. In all
but two instances, there were significant differences in the separate race
equations for predicting technical training performance from educational
background. In most instances, the data suggest that differences in
race-based prediction equations are attributable to the equations' intercepts;
that is, while usually the predicted technical training grade increases for
each subgroup by about the same amount for each increase of one score unit
on the predictor, the constants added into the equations differ. This
results in parallel prediction lines for the subgroups which differ mainly
in level.

Table 10 of your handout demonstrates the impact of these equation
differences. This table was developed from separate subgroup (i.e.,
Caucasian, Black, male, female) regression equations for predicting training
performance from test and educational background data. From this table it
can be seen that, when total group means on the selector AI, AFQT, and EI
are substituted into the Black and Caucasian equations, a lower criterion
value is predicted by the Black equation. Thus, when a single overall
equation is used, the tendency would be to predict higher Black criterion
performance than is observed. A single overall equation tends to
underpredict female criterion performance in Food Service, Administrative, and
Medical specialties and to overpredict for them in Mechanical specialties
and in Law Enforcement.
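
The effect of parallel subgroup lines on a single overall equation can be illustrated with a small synthetic example. All numbers below are made up for illustration (they are not from Table 10), and both groups are deliberately given the same predictor scores so that only the intercept difference is at work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def mean_residual(xs, ys, intercept, slope):
    """Average of observed minus predicted criterion values."""
    return sum(y - (intercept + slope * x) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical subgroup equations with a common slope of 0.2 and
# intercepts of 70 (group A) and 67 (group B); A is twice as numerous.
scores = list(range(30, 91, 10))
xa = scores * 2
ya = [70.0 + 0.2 * x for x in xa]
xb = list(scores)
yb = [67.0 + 0.2 * x for x in xb]

# A single overall equation fitted to the pooled cases.
b0, b1 = fit_line(xa + xb, ya + yb)
res_a = mean_residual(xa, ya, b0, b1)  # positive: group A is underpredicted
res_b = mean_residual(xb, yb, b0, b1)  # negative: group B is overpredicted
print(b0, b1, res_a, res_b)
```

The pooled line keeps the common slope but takes an intercept between the two subgroup constants, so the group with the lower intercept is systematically overpredicted, which is the pattern the talk describes.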

One validation study has been accomplished specifically for the benefit
of the high school testing program. This provided validity information
against high school vocational-technical curriculum grades (Jensen &
Valentine, 1976). The sample for this study consisted of approximately
4,300 high school students, primarily from the northeastern sector of the
country. Each of these students was enrolled in one or another of 41
different high school vocational-technical courses. Each student had been
administered ASVAB-2 during the 1973-74 school year. High schools
participating in the study provided the final course grade for each student for
the vocational-technical course in which the student was enrolled. For
each of the 41 high school vocational-technical courses, validities of the
nine subtests of ASVAB-2 against the course grade were computed and multiple
correlations against that same criterion were obtained. Table 11 presents
the average validities of the ASVAB subtests for vocational-technical courses
judged to be subsumed under the four aptitude areas used in the Air Force
classification program. For these 41 vocational-technical courses, multiple
correlations of the nine ASVAB subtests against course grade ranged from
.30 to .93 with a median of .54.

During the spring of 1977, new ASVAB composites for the High School
Testing program were developed for school year 1977-78. These composites
were based on an oblique factor analysis of the subtests of the battery and
are believed to be more representative of dimensions of human ability than
were the composites which they replace. A primary goal in developing these
new composites was to provide a set of scores for high school counseling
which provide better differentiation among abilities than was available from
the previous composite set reported to the schools by MEPCOM.

For an unpublished study (Valentine and Mathews), a USAREC computer
tape file of data on AFEES testing accomplished 1 April 1976 through
31 March 1977 was obtained from the Defense Manpower Analysis Center. This
file identifies individuals processed at the AFEES, indicates the service
for which they were processed, identifies the specific test or tests (by
form) administered to them, and, in the case of ASVAB, records all subtest
raw scores. A subfile, developed from the larger file, consisted of
individuals AFEES processed for the Air Force who were administered Form 5,
6, or 7 of ASVAB. The file was further reduced by deletion of cases on whom
identification data was insufficient for collation with other files or for
whom test score data in the file was incomplete; effectively, this was a
file consisting of Air Force personnel who were administered Form 6 or 7 of
the ASVAB.

The resulting file was matched against the Air Force Technical Training
file to obtain identity of technical training courses and final course
grades. This collated file was subdivided on the basis of ASVAB form
taken (6 or 7) and technical training courses completed. Analyses were
accomplished separately on resulting samples when N was equal to or greater
than 50. For Form 6, 10 such samples were available and for Form 7, 16
samples were available.

For each case, raw scores for the High School composites in use during
the 76-77 school year and for the High School composites proposed for school
year 77-78 were computed. In each of the 26 available samples, correlations
of these two sets of composite scores with final course grade were computed.

Table 12 lists the 16 technical courses for which validation samples
were available; the first column of Table 12 shows the AFSC associated with
the course, column 2 shows the associated job title, while the last column
indicates the Air Force composite and percentile normally required for
assignment to the course.

Table 13 summarizes validity and composite intercorrelational
information for both the current and proposed High School composites for ASVAB
Form 6 samples, while Table 14 provides a similar summary for Form 7 samples.
For each sample, the tables show the number of cases in the sample, the
range of composites vs. final course grade correlations obtained for each
set of composites separately, and the median intercorrelation within each
set of composites. It should be noted that correlational values presented
are obtained values; they have not been corrected for range restriction.
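
For readers unfamiliar with the range-restriction correction referred to above, one commonly used form (often called Thorndike's Case 2, for direct selection on the predictor) is sketched below. It is offered as an illustration only: the talk does not say which correction would apply to these samples, and the function name and example values are hypothetical.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted-group correlation from a restricted one.

    r:               correlation observed in the selected (restricted) group
    sd_unrestricted: predictor standard deviation in the applicant group
    sd_restricted:   predictor standard deviation in the selected group
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1.0 - r * r + r * r * u * u)


# Hypothetical example: an obtained validity of .30 where selection has
# cut the predictor standard deviation from 15 to 10.
print(round(correct_for_range_restriction(0.30, 15.0, 10.0), 3))
```

When the two standard deviations are equal the function returns the obtained value unchanged, and the estimate grows as selection becomes more severe.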

The most immediate conclusions from these data are that proposed High
School ASVAB composites (a) should prove as useful as the current set for
success predictions and (b) should prove more useful than the present set
for counseling use, both in terms of greater spread of validities among the
composites (therefore providing easier identification of one or two relevant
composites) and in terms of greater differentiation in ability patterns for
individual subjects (as reflected in lower intercorrelations among the
composites in the set).

I mentioned earlier that ASVAB is expected to serve a variety of uses
which necessitate the battery's use with examinees ranging from ninth graders
through seniors and young adults applying for service enlistment. A great
deal of evidence suggests that there is a large difference in difficulty
of test material between the tenth and eleventh grades. Thus, tests which
are "easy" enough for ninth and tenth graders are too easy for the other
groups, and tests which are of appropriate difficulty for eleventh and
twelfth graders are far too hard for ninth and tenth graders. Moreover,
the various services have different cut-off requirements. In a 20-item
scale, it is nearly impossible to balance these varied difficulty
requirements. This, in turn, impacts on the amount of chance (or unreliability)
variance for less mature subjects and unduly restricts variance among the
more mature subjects. This, in turn, will tend to limit the battery's
validity. One solution to this problem would be production of the battery
in "easy" forms appropriate for ninth and tenth graders and some of the
less demanding service needs, and "hard" forms appropriate for eleventh
and twelfth grade use and more demanding service selection and classification
needs. Judicious normative procedures could calibrate the two versions to
service normative standards with a region of normative overlap. This would
effectively double battery length and provide more reliable test data
provided procedures (such as a short version placement scale) are employed
for service applicants.

Several years ago, I tried out an idea for test validation against
operational criteria which I believe may have utility in evaluating the
battery against "job performance." I did not formally report the effort
or attempt to replicate it on later samples because changes in the standards
and criteria by which Air Force career progress occurs, instituted at about
that time, confound the criterion. However, there may be places within the
other services where the approach is feasible. At that time, Air Force
promotion opportunity was limited by such factors as time in service,
specialty, and skill level. Working from the Uniform Airman Record file,
I sorted several of the more populous specialties into homogeneous
subsamples with respect to these factors. I then tested for significant
differences in mean selector aptitude index for higher and lower ranking
personnel within these year and skill level groups. Table 15 summarizes
outcomes for Security Police; similar results were obtained for the several
other specialties examined. One can assume that supervisors tend to work
for promotion of their most capable workers first. These data certainly
provided evidence that, other things being equal, the higher aptitude
personnel are the first promoted. For the benefit of any of you who may feel
uneasy about this "backwards" sort of application of the F ratio, let me
point out that, in this instance, F = t². Certainly in our future joint
ASVAB validation efforts, approaches of this sort might be considered
whenever we locate samples whose promotion is not contingent on application of
a promotion score equation.
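
The identity F = t² for a two-group comparison can be checked numerically. The sketch below computes a pooled-variance t and a one-way analysis-of-variance F on the same two groups; the aptitude scores are invented for illustration and are not drawn from Table 15.

```python
import math
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))


def one_way_f(a, b):
    """One-way ANOVA F ratio for two groups (1 and na + nb - 2 df)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    grand = (sum(a) + sum(b)) / (na + nb)
    ss_between = na * (ma - grand) ** 2 + nb * (mb - grand) ** 2
    ss_within = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    return ss_between / (ss_within / (na + nb - 2))


# Hypothetical aptitude index scores for lower- and higher-ranking groups.
lower_rank = [45.0, 50.0, 55.0, 60.0, 52.0]
higher_rank = [58.0, 62.0, 65.0, 70.0, 61.0, 66.0]

t = pooled_t(lower_rank, higher_rank)
f = one_way_f(lower_rank, higher_rank)
print(math.isclose(f, t * t, rel_tol=1e-9))
```

Because the two statistics are algebraically equivalent with two groups, testing mean aptitude across rank groups with F gives the same significance decision as the corresponding two-tailed t test.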

There is continuing concern within DoD for development of a single set
of ASVAB composite scores with applicability across programs. Analyses
designed to establish feasibility of a single composite set are being
planned at the present time with ASVAB Forms 6 and 7. Essentially, these
will involve cross-service application of current sets and examination
of alternatives to these. These latter analyses, involving establishment
of regression equations for various service schools from subtest information
and assessment of cross-service homogeneity, will entail a number of special
problems which will require careful control. Among these, the extent to which
service training criteria have the same meaning, even for essentially
identical jobs, is unknown. Thus, we anticipate that these analyses will
be relatively complex.

In summary, most Air Force studies indicate useful ASVAB validity for
training criteria. Next, validation efforts will explore the extent to which
a single set of composites, adequately responsive to selection and
classification strategies of all the services, is possible.

TABLE 5. Within-Sample Sizes

Columns, in order: Group; Job Area (AFSC); Caucasian N; Black N;
Other Min. N; Male N; Female N; Total N

01

Intel  ipciKc  ( 20X30) 

24  S 

V) 

235 

55 

290 

02 

Audiovisual  (23X  30) 

171 

43 

- 

183 

31 

214 

03 

Weather  (25X3X), 

317 

55 

278 

96 

374 

04 

Command  Control  System*  Operator 
(27X3  X) 

664 

230 

■*90 

115 

905 

05 

Communications  Opera  tioiw(29t30) 

369 

195 

4l,’> 

158 

567 

Oh 

Communkatiom-t'kt'tniiiVa  Systems 
(30X3X) 

tv849 

181 

$3 

1.740 

343 

•  2,083 

07 

Missile  electronic  Maintenance 
(3IX3X) 

544 

53 

517 

95 

612 

08 

Avionics  Systems  (32X3X) 

7,163 

244 

57 

2,014 

v'50 

2.464 

09 

TraWn*  Devices  (3«X3X) 

170 

- 

-  - 

158 

C 

178 

10 

Wire  Communications  Systems  Maintenance 
(361/3X0) 

226 

66 

303 

303 

tl 

Wire  Communications  System*  Maintenance 
(362X0) 

224 

69 

— , 

287 

302 

12 

Intricate  Equipment  Maintenance 
(40X3X) 

75 

24 

101 

103 

13 

AircraTt  Accessory  Maintenance 
(42X3X) 

.,598 

1.04 1 

98 

2,117 

550 

2,7,17 

14 

Aiicnft  Accessory  (43130) 

193 

- 

- 

177 

44 

....  Ml 

15 

Akenft  Maintenance  (431 31) 

4.S59 

1,073 

104 

4,(68 

1.268 

5,736 

Id 

Aircraft  Engineer  (4323X) 

1,356 

363 

44 

1.431 

332 

1.763 

17 

Miaul*  Maintenance  (44X3X) 

241 

52 

259 

36 

295 

IS 

Munitions  and  Weapons  Maintenance 
(46130) 

837 

162 

i,a* 

1,001 

19 

Munition*  ard  Weapons  Maintenance 
(46230) 

912 

154 

**» 

1.CS4 

1.084 

20 

MufdtioM  and  Weapon*  Maintenance 
(46330) 

194 

208 

•w 

209 

21 

Vehicle  Maintenance  (47X3X) 

251 

28 

- 

262 

- 

282 

22 

Computer  Systems  (51 X3X) 

ri 

- 

- 

183 

86 

269 

23 

Metal  Working  (53X3X) 

65S 

160 

- 

659 

168 

827 

24 

Mechanical/ Electrical  (S4X3X) 

831 

297 

- 

970 

181 

MSI 

25 

Structurai/PavemenU  (S5X3X) 

505 

75 

- 

471 

119 

590 

26 

Sanitation  (56330) 

215 

36 

- 

251 

2S1 

27 

lift  Protection  (571 30) 

507 

188 

- 

709 

- 

711 

28 

Fabric  and  Rubber  Froducti 
(58X30) 

178 

42 

194 

29 

223 

29 

Tran*  partition  (60X3X) 

1.106 

400 

40 

1,346 

200 

1.546 

30 

Food  Service  (62X3X) 

256 

136 

- 

264 

117 

401 

31 

Fuel  Services  (63 1 30) 

367 

26i 

644 

- 

646 

32 

Inventory  Man^ement  (64530) 

1,199 

587 

83 

1,313 

556 

1,869 

33 

Material  Facilities  (64730) 

481 

360 

- 

541 

317 

•SI 

34 

Accounting!  and  Finance,  and  Audit  in* 
(67X3X) 

439 

too 

372 

179 

551 

35 

Admlnhtiation  (70X3X) 

1,503 

1,078 

56 

1,716 

921 

2,637 

36 

Personnel  (732JO) 

453 

180 

- 

463 

183 

641 

37 

Security  Mice  (81130) 

2.172 

1.222 

44 

3,431 

3*438 

38 

Law  Enforcement  and  Corrections 
(81230) 

1,078 

256 

-M-. 

900 

448 

1,348 

39 

Medical  (90010) 

934 

404 

28 

912 

454 

1,366 

40 

Medical  (90X3X) 

*,385 

470 

41 

1,283 

620 

1JWJ 

41 

Medical  (9IX3X) 

249 

48 

- 

251 

49 

300 

42 

Aircrew  Protection  (92230) 

332 

63 

- 

339 

63 

402 

43 

Dental  (98X3X) 

241 

61 

- 

212 

108 

320 

Race N's or Sex N's do not necessarily equal total N. This is because the
subsample N's are shown only for subsamples with 24 or more cases, on which
within-subsample validities were computed.

Table 6. Educational Index and ASVAB Composite Validities
Against Final School Grade

        Educ           ASVAB Composite
Group   Index   AFQT   Mech   Adm    Gen    Elect
01      .38     .42    .25    .30    .40    .37
02      .40     .26    .30    .41    .33    .35
03      .25     .38    .28    .22    .28    .37
04      .22     .38    .28    .14    .35    .34
05      .26     .29    .21    .25    .34    .27
06      .28     .34    .23    .21    .34    .44
07      .30     .37    .22    .23    .31    .45
08      .27     .29    .22    .21    .29    .33
09      .32     .32    .26    .37    .33    .32
10      .23     .26    .29    .23    .31    .32
11      .20     .30    .20    .23    .26    .25
12      .40     .45    .40    .55    .50    [one value missing; alignment uncertain]
13      .26     .31    .40    .18    .31    .36
14      .31     .43    .45    .25    .30    .50
15      .24     .32    .34    .18    .31    .36
16      .32     .42    .43    .33    .40    .46
17      .19     .34    .29    .26    .28    .29
18      .21     .32    .34    .23    .32    .32
19      .22     .37    .27    .22    .34    .37
20      .45     .42    .42    .32    .42    .46
21      .26     .40    .53    .25    .39    .51
22      .13     .32    .05    .27    .26    .24
23      .24     .36    .24    .25    .30    .34
24      .18     .36    .40    .21    .35    .33
25      .16     .24    .34    .12    .17    .26
26      .37     .36    .45    .33    .39    .41
27      .20     .28    .34    .26    .23    .30
28      .28     .28    .41    .19    .09    .25
29      .28     .43    .23    .20    .38    .35
30      .09     .10    .03    .12   -.04    .03
31      .15     .29    .39    .19    .26    .34
32      .27     .30    .18    .13    .34    .29
33      .17     .29    .17    .19    .26    .25
34      .25     .41    .27    .03    .43    .41
35      .23     .32    .16    .20    .32    .27
36      .33     .50    .25    .24    .46    .41
37      .24     .30    .29    .23    .21    .28
38      .30     .38    .32    .26    .39    .39
39      .32     .42    .29    .28    .34    .40
40      .33     .42    .30    .28    .38    .38
41      .31     .35    .21    .25    .37    .30
42      .18     .26    .22    .13    .10    .21
43      .39     .43    .28    .39    .43    .38


TABLE 7. Validity and Contribution to Prediction of Final School
Grade of Educational Background and Test Data

Columns, in order: Group; R for predictor sets(a) (I) Total, (II) Test Only,
(III) EI Only; F for contribution of Test; F for contribution of EI

01 

.54 

A1 

-38 

30.5.3 

29.86 

02 

.47 

36 

.40 

7.69 

23.76 

03 

.46 

.40 

.25 

33.80 

21.73 

04 

.42 

.40 

.22 

69jOO, 

1034 

05 

.41 

31 

32j65 

21.77 

06 

.49 

.46 

.28 

230.  J  7 

92.73 

07 

.50 

.48 

.30 

6676 

20.22 

06 

.40 

.36 

.27 

126.45  ' 

8138 

09 

.43 

31 

32 

8.66 

9.57 

10 

.40 

34 

.23 

19.00 

15.55 

11 

37 

32 

.20 

16.07 

10.47 

12 

.59 

.54 

.40 

12.59 

8.17 

13 

.46 

.44 

.26 

251.70 

54.01 

14 

.56 

.55 

31 

35.03 

6.03# 

IS 

.43 

.42 

24 

463.23 

106.63 

16 

.54 

.51 

.32 

242.55 

78.11 

17 

.41 

.40 

.19 

23.40 

4.89* 

18 

.45 

.42 

.21 

98.74 

27.64 

19 

.42 

.40 

.22 

86.99 

25.00 

20 

.55 

.48 

.45 

14.16 

20.30 

21 

.58 

.57 

.26 

56.23 

6.75 

22 

35 

32 

.13 

15.18 

4.17 

23 

.41 

3b 

.24 

57.13 

27.77 

24 

.50 

.49 

.18 

162.42 

17.19 

25 

.38 

3« 

.16 

41.07 

3.43b 

26 

.54 

.49 

.37 

27.40 

19.82 

27 

.32 

.29 

.20 

23.79 

15.15 

28 

.45 

.42 

.28 

16.63 

562* 

29 

.48 

.44 

.28 

153.82 

8037 

30 

.18 

.14 

.09 

4.70 

4.66* 

31 

.32 

3 1 

.15 

28.99 

S.00* 

32 

38 

.32 

.27 

81.10 

96.92 

33 

.33 

.30 

.17 

36.22 

12.98 

34 

.42 

.41 

.25 

38.19 

6.52w 

35 

.37 

34 

.23 

129.97 

69.56 

36 

.54 

.51 

.33 

86.72 

29.41 

37 

36 

.31 

.24 

136.60 

136.60 

38 

.46 

.42 

.30 

102.71 

51.91 

39 

.49 

.43 

-32 

118.79 

92.71 

40 

.50 

.45 

.33 

176.04 

124.01 

41 

.46 

.40 

31 

,  20.70 

57.10 

42 

.31 

.27 

.18 

’4.70 

11.68 

43 

.54 

.49 

39 

31.57 

24.94 

a Predictors for the R's in the columns are:

I = AFQT, Selector AI, and Education Index
II = AFQT and Selector AI
III = Education Index only.

b Not significant. All other F's are significant at or beyond the .01 level.
c Significant at the .05 but not at the .01 level.


Table 8. Tests of Hypotheses re Race
Equity of Educational Background and Test
Data Based Predictions

Columns, in order: Group; R for models(a) I, II, III; F for hypotheses(b)
H1, H2

01 

.54 

.57 

38 

2.09* 

.51 

02 

.47 

.51 

32 

1.80 

03 

.46 

.52 

.52 

4.05** 

.02 

04 

.42 

.43 

.43- 

1.95 

05 

.41 

.42 

.42 

.86 

' 

06 

.49 

.50 

.50 

3.61** 

139 

07 

.50 

31 

.S3 

2.48* 

1.97 

08 

.40 

.41 

.41 

5.02** 

1.13 

09 

.43 

.47 

.48 

1.25 

10 

.40 

.40 

.44 

137 

11 

.37 

39 

.42 

1.62 

12 

.59 

.60 

.62 

.77 

13 

.46 

.46 

.46 

1.69 

14 

.56 

.57 

39 

1.16 

15 

.43 

.46 

.46 

24.1 5** 

3.64** 

26 

.54 

.56 

36 

6.00** 

1.19 

17 

.41 

.44 

.45 

1.54 

13 

.45 

.46 

,47 

3.08** 

1.60 

19 

.42 

.44 

.45 

3  95** 

1.64 

20 

.55 

.58 

.61 

2.68** 

133 

21 

.58 

.59 

.60 

1.16 

22 

.35 

35 

38 

.94 

23 

.41 

.45 

.48 

7.1 5** 

3.77** 

24 

.50 

30 

30 

.89 

25 

.38 

.42 

.42 

2.92** 

33 

26 

.54 

38 

.58 

1.90 

27 

.32 

.43 

.43 

8.88** 

.06 

28 

.45 

.47 

30 

1.71 

29 

.48 

.50 

32 

1G.18** 

5.60** 

30 

.18 

.25 

.29 

2.87** 

133 

31 

.32 

.42 

.44 

8.70** 

139 

71 

.38 

.40 

.40 

4.7  !•* 

2.40* 

.33 

34 

36 

2.91** 

2.45* 

34 

.42 

.44 

.44 

130 

35 

.37 

39 

.40 

7.90** 

2.02 

36 

.54 

.57 

37 

3.89** 

1.44 

37 

.36 

.42 

.43 

28.87** 

1.96 

38 

.46 

.49 

30 

7.91** 

1.03 

39 

.49 

.55 

.56 

18.85** 

2.60* 

40 

.50 

35 

.56 

20.42** 

10.93** 

41 

.46 

,46 

.47 

.71 

42 

.31 

34 

36 

1.67 

43 

.54 

36 

.57 

1.60 

a Predictors in the four models are: I = AFQT, Selector
AI, Education Index (Problem 3); II = Race, AFQT, Selector
AI, Education Index (Problem 8); III = Race, Race x
AFQT, Race x Selector AI, Race x Education Index
(Problem 9).

b H1 = Knowledge of race contributes nothing to test
and EI based prediction of final school grade (Problem 9
vs. Problem 3). H2 = Equation slopes are homogeneous
(Problem 9 vs. Problem 8).

*Significant at the .05 level.

**Significant at the .01 level.


TABLE 9. Tests of Hypotheses re Sex Equity
of Educational Background and Test
Data Based Predictions

Columns, in order: Group; R for models(a) I, II, III; F for hypotheses(b)
H1, H2

0! 

.54 

35 

35 

.90 

02 

.47 

.47 

-47 -i , 

-  42  -  :,r 

03 

<46 

46 

47 

14)6, . 

04  ■ 

A% 

.42, 

42 

1.06.<  v 

05 

.41 

.41 

-41-,,. 

,-07,. 

06 

.49 

.50 

39. 

2.14  f-v 

07 

.50 

30 

31 

1.90 

08 

.40 

.40< 

41 

5  543**  621** 

09 

.43 

46 

47 

m  '*'>*** 

n 

37 

39 

.40 

242*  1.26 

12 

.59 

39 

.60 

;sl  .  ■ 

13 

.46 

.46 

.46 

2.08 

14 

.56 

37 

39 

2.77*  3.42* 

15 

.43 

.43 

.44 

i6.76**  2237**i 

16 

.54 

34 

.55 

7.79**  )0.14** 

17 

.41 

41 

.45 

2.72*  333* 

20 

55 

.58 

38 

2.99* 

22 

35 

36 

38 

1,87 

23 

41 

42 

.43 

3.62**  435** 

24 

.50 

.50 

31 

4.08**  4.82** 

25 

.38 

J8 

41 

347*  432** 

27 

32 

32 

32 

.92 

28 

.45 

45 

4!) 

2.70*  3.44* 

29 

.44 

.48 

.46 

50 

30 

.18 

30 

31 

6.80**  62 

31 

32 

32 

32 

02 

32 

38 

33 

39 

1.69 

33 

33 

33 

35 

4.68**  539” 

34 

.42 

.42 

44 

1.99 

35 

37 

38 

38 

5.99**  1j64 

36 

.54 

35 

35 

135 

38 

.46 

.47 

.47 

5.69**  1.04 

39 

A9 

30 

30 

4.92**  2.11 

40 

.30 

30 

30 

4.13**  133 

41 

.46 

.46 

48 

278 

42 

31 

34 

34 

1.71 

43 

.54 

35 

.56 

1.96 

a Predictors in the four models are: I = Education Index,
AFQT, Selector AI (Problem 3); II = Sex, AFQT, Selector
AI, Education Index (Problem 14); III = Sex, Sex x AFQT,
Sex x Selector AI, Sex x Education Index (Problem 15).

b H1 = Knowledge of sex contributes nothing to EI and
test based predictions of final school grade (Problem 15 vs.
Problem 3). H2 = Equation slopes are homogeneous
(Problem 15 vs. Problem 14).

*Significant at the .05 level.
**Significant at the .01 level.

TABLE 10. PREDICTED CRITERION SCORES (ASSUMING MEAN PREDICTOR
PERFORMANCE) FOR SELECTED SUBSAMPLES*

COURSE    Y'      Y'      Y'      Y'
GROUP #   CAUC    BLACK   MALE    FEMALE
04        86.47   84.99   86.05   86.90
05        85.19   83.84   84.68   85.06
06        84.93   82.82   84.91   84.16
08        84.24   81.82   83.93   83.94
13        82.20   81.42   82.04   82.53
15        84.01   80.14   83.21   81.84
16        84.68   81.91   84.09   82.41
18        89.06   87.40   -       -
19        89.30   86.86   -       -
23        84.15   79.19   83.27   82.49
24        85.45   80.32   81.05   79.30
25        -       -       81.38   80.05
27        86.88   82.96   -       -
29        82.63   79.44   81.95   82.55
30        87.43   84.69   85.42   89.31
31        91.31   87.62   -       -
32        84.32   82.40   83.56   84.14
33        82.96   81.26   82.29   82.83
34        80.46   77.84   80.03   80.00
35        84.04   82.19   82.98   84.03
36        87.07   84.58   86.24   87.07
37        86.0?   82.19   -       -
38        83.44   80.59   83.36   81.98
39        83.22   77.94   81.22   82.75
40        82.01   77.39   80.55   81.67
43        -       -       81.33   83.17

* These values are computed only for subsamples with N ≥ 100.

TABLE 11. Average ASVAB Subtest Validities Within School

Aptitude Area     CS    WK    AR    TK    SP    MC    SI    AI    EI
Administrative    .18   .34   .34  -.07   .12   .18  -.01   .01   .06
Electronics       .21   .16   .23   .25   .26   .30   .35   .24   .37
Mechanical        .15   .16   .22   .30   .24   .27   .26   .26   .25
General           .28   .32   .32   .12   .31   .29   .18   .15   .?4


TABLE 12. Validation Samples of Technical Training Courses

AFSC     Job Title                                            Selector**
27630*   Apr. Aerospace Control & Warning Systems Operator    G-60
32632    Apr. Integrated Avionic Systems Specialist           E-80
42632*   Apr. Jet Engine Mechanic                             M/E-40
43131*   Apr. Aircraft Maintenance Specialist                 M/E-50
46130    Apr. Munitions Maintenance Specialist                M/E-60
46230*   Apr. Weapons Mechanic                                M/E-60
57130    Apr. Fire Protection Specialist                      G-40
60531    Apr. Air Cargo Specialist                            A-50
63130    Apr. Fuel Specialist                                 G/M-40
64530*   Apr. Inventory Management Specialist                 A/G-60
64531    Apr. Material Facilities Specialist                  A/G-60
70230*   Apr. Administration Specialist                       A-40
73230*   Apr. Personnel Specialist                            A-60
81130*   Apr. Security Specialist                             G-40
81230*   Apr. Law Enforcement Specialist                      G-50
90230*   Apr. Medical Services Specialist                     G-60

*ASVAB Form 6 samples were available for those AFSCs indicated by
an asterisk. ASVAB Form 7 samples were available for all AFSCs listed.

**Composites are M = Mechanical, A = Administrative, G = General,
and E = Electronic; M/E indicates either M or E may be the selector.

TABLE 13. ASVAB-6 High School Composite Validities
and Intercorrelations

                Current Composites        Proposed Composites
                Validity    Median        Validity    Median
AFSC     N      Range       Intercorr.    Range       Intercorr.
27630    61     -.06/.41    .55           -.18/.41    .32
42632    56      .08/.57    .43            .02/.45    .09
43131    341     .32/.48    .62            .16/.46    .32
46230    62      .25/.58    .62            .00/.55    .40
64530    134     .20/.52    .59           -.04/.52    .31
70230    110     .19/.39    .60            .00/.44    .36
73230    54      .09/.50    .58           -.13/.50    .33
81130    400     .25/.40    .62            .07/.42    .37
81230    239     .27/.50    .61            .10/.50    .36
90230    102     .23/.49    .80           -.01/.60    .36

TABLE 14. ASVAB-7 High School Composite Validities
and Intercorrelations

                Current Composites        Proposed Composites
                Validity    Median        Validity    Median
AFSC     N      Range       Intercorr.    Range       Intercorr.
27630    106     .14/.44    .61           -.01/.44    .27
32632    59      .11/.38    .32            .09/.42    .32
42632    133     .24/.47    .58            .21/.40    .30
43131    623     .22/.36    .49            .09/.36    .20
46130    74      .22/.53    .56            .07/.53    .31
46230    178     .38/.55    .63            .18/.49    .38
57130    113     .21/.38    .51            .09/.41    .21
60531    75      .42/.69    .61            .34/.60    .45
63130    52      .04/.16    .94            .02/.13    .86
64530    251     .23/.47    .60            .05/.47    .39
64531    71      .07/.41    .55           -.05/.41    .31
70230    165     .02/.14    .81           -.08/.16    .66
73230    106     .29/.43    .54            .16/.44    .30
81130    642     .23/.31    .69            .18/.31    .45
81230    359     .24/.41    .68            .09/.41    .52
90230    147     .30/.42    .86            .28/.40    .73


TABLE 15. RELATIONSHIP BETWEEN GEN AI AND
GRADE FOR SECURITY POLICE

SKILL LEVEL 5

                    N'S                     MEANS
YEARS OF     E-4 AND               E-4 AND
SERVICE      BELOW       E-5       BELOW      E-5        F

04           639         140       52.29      61.07      54.85**
05           234         108       53.35      64.12      41.34**
06           94          104       52.23      58.37      19.83**
07           94          88        54.52      59.77      6.76**
08           [illegible] 89        51.59      55.28      3.38
09           35          81        47.43      53.33      8.23**
10           25          54        44.40      53.70      5.86*

SKILL LEVEL 7

                    N'S                     MEANS
YEARS OF     E-6 AND               E-6 AND
SERVICE      BELOW       E-7       BELOW      E-7        F

09           414         26        54.54      54.60      .16
10           440         33        53.94      59.85      5.40**
11           360         58        53.53      56.38      2.07
12           357         92        56.33      63.04      13.47**
13           278         104       53.45      60.58      15.87**
14           306         109       50.02      56.61      12.01**
15           351         191       50.33      57.85      21.11**
16           [illegible] 396       46.34      56.86      71.17**
17           289         205       44.74      56.71      44.46**
18           345         317       43.80      53.94      42.87**
19           466         586       44.86      56.62      87.97**
20           47          213       50.00      59.44      11.62**

*  Significant at the .05 level.
** Significant at the .01 level.
A  JOINT  SERVICE  ACCESSION  TEST:  PROBLEMS  AND  PROMISE 


Steven  Gorman 

Headquarters,  U.  S.  Marine  Corps 
Washington,  D.  C. 


Development  of  a  test  battery  which  has  several 
purposes  for  several  types  of  users  is  difficult  indeed. 
Add  to  this  situation  the  logistical  problem  of  gathering 
all  the  test  developers  and  users  at  a  central  location, 
include  a  tight  time  constraint,  and  you  have  an  idea  of  the 
environment  in  which  the  Armed  Services  Vocational  Aptitude 
Battery  (ASVAB)  was  developed.  The  ASVAB  currently  serves 
as: 


1)  a  counseling  instrument  for  high  schools 

2)  an  accession  tool  for  the  four  services 

3)  a  differential  classification  battery  for  the  four 
services 

The users are the thousands of schools testing millions of
high school students, and hundreds of Armed Forces Examining
and Entrance Stations (AFEES), mini-AFEES, and Mobile
Examining Teams (METs), testing millions of service
applicants.

The high school composites were redesigned for this
present school year to achieve maximum differential
classification ability. This was achieved through the use
of factor analysis. The resultant composites have high
factor loadings and low intercorrelations.

Presently the four services have three metrics for
measuring test scores: Navy standard score (mean = 50,
standard deviation = 10), Army standard score (mean = 100,
standard deviation = 20), and percentile scores.
Additionally, each service has its own compositing formulas
for classification. The common thread among all services is
the test score used for determining mental group level, the
combination of the word knowledge (WK), arithmetic reasoning
(AR), and space perception (SP) subtests. This is referred
to as the Armed Forces Qualification Test (AFQT) score, a
carryover from pre-ASVAB testing.
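The relationship among the three metrics can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the percentile conversion assumes the standard scores are normally distributed in the reference population, which the paper does not state.

```python
from statistics import NormalDist

NAVY_MEAN, NAVY_SD = 50.0, 10.0    # Navy standard score metric (from the text)
ARMY_MEAN, ARMY_SD = 100.0, 20.0   # Army standard score metric (from the text)


def navy_to_army(navy_score):
    """Re-express a Navy standard score on the Army metric via the z-score."""
    z = (navy_score - NAVY_MEAN) / NAVY_SD
    return ARMY_MEAN + ARMY_SD * z


def navy_to_percentile(navy_score):
    """Percentile rank, ASSUMING a normal score distribution (not in the source)."""
    z = (navy_score - NAVY_MEAN) / NAVY_SD
    return 100.0 * NormalDist().cdf(z)
```

For example, a Navy standard score of 60 is one standard deviation above the mean, which corresponds to 120 on the Army metric.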
One problem which is common to all large scale
institutional testing programs is the leakage of test
questions/answers to examinees prior to testing. This
phenomenon also occurs in the military testing environment.
The occurrence of this phenomenon appears to be much more
widespread in certain recruiting districts than in others.
For managerial purposes, a composite was developed (Sims,
1976) using non-AFQT subtests to predict AFQT and General
Technical (GT) scores.

The concept underlying this composite was that the subtests
which are vital to enlistment are most apt to be subject to
compromise. With this composite, averages of predicted AFQT
and GT scores can be compared with the averages of actual
AFQT and GT scores. This can be used as a management device
to detect significant differences at AFEES, recruiting
districts, recruiting stations, and recruiters.

The data sample consisted of 3,081 Marine Corps
recruits who were tested at AFEES on ASVAB form 3. Upon
arrival at the recruit depots in December 1975 and January
1976, they were administered ASVAB form 6 or 7. They were
also administered the Army Classification Battery (ACB-61)
as a reference test. To equalize the testing effects, a
counterbalanced design was used wherein half the recruits
were administered ACB-61 first, and half ASVAB first. The
sample was weighted to approximate the normal mobilization
population. Multiple regression analysis was conducted to
determine the subtests that were the best predictors of the
AFQT and GT composites, the two criteria used for accession
into the Marine Corps.


Table 1 shows the results of the analysis of raw score
AFQT and GT predictors, the standard error of estimate, and
the amount of variance accounted for by the predictors. It
should be noted that the correlation of the prediction
composite with AFQT or GT is comparable to the correlation
between alternate AFQT or GT composites.

A likely procedure for reducing test compromise would be to
compute the actual and predicted AFQT and GT scores, and
retest those applicants whose predicted and actual scores
were statistically different at the .10 level.
The retesting would be conducted on an alternate AFQT form,
one which is used solely for test score verification.
Several test forms are presently being normed at AFEES to be
used for test score verification. When properly normed,
these AFQT subtests could be used with this procedure to
reduce compromise. Another procedure is to collect several
months of applicant scores broken down by recruiters and
recruiting stations, and determine the statistical
significance of the differences between true and predicted
mean scores. A management report can then be compiled noting
all recruiters and recruiting stations where the mean score
differences are statistically significant at the .05 level.

Another problem all services are facing is that of
reducing personnel attrition prior to the expiration of
active service (non-EAS attrition). Recent DOD guidelines
have established limits on the percent of high school and
non-high school graduates who will be permitted to be
attrited prior to EAS. To reduce this attrition, the
services are looking at biographical data and attitudinal
items for possible inclusion into a screening test at AFEES.

A study was conducted for the Marine Corps (Sims, 1977)
to determine the variables related to non-EAS attrition. As
expected, mental ability, age at enlistment, level of
education achieved, and number of dependents were all
statistically significant variables. Table 2 summarizes the
order and value of these variables entering into the
regression equation. Tables 3 through 9 present the
predicted chances of success for each level of education,
age, and mental group. Additionally, however, the study
examined the ASVAB subtests to determine if a combination of
these could be used to predict attrition.

Employing the data derived from the 3,081 recruits
tested in the beginning of 1976 less those recruits who were
reservists, a multiple regression analysis was conducted
with all ASVAB subtests using the criterion of being in
service after 14 months. Table 10 shows the order and value
of each subtest entering the multiple regression. The
variables which are significant are mostly non-cognitive or
speeded subtests, along with educational level achieved.
The predictor with the greatest percent of variance
explained is numerical operations (NO), a fifty item speeded
test. The variable which enters into the regression
equation next is the combat scale (CC), a twenty-seven item
scale of the Army Classification Inventory (ACI). The ACI also
provides the attrition composite with the attentiveness
scale (CA), a twenty item interest test. Another subtest in
the attrition composite is a speeded clerical test,
attention to detail. Space perception is the only wholly
cognitive non-speeded test in the attrition composite.
Table 11 summarizes the regression of the ASVAB subtests and
education onto non-EAS attrition. Utilizing the attrition
composite rather than mental ability significantly increases
the attrition prediction. Table 12 is an expectancy table
of successful completion of 14 months service for high
school graduates; Table 13, for non-high school graduates.
Table 14 shows the cross-validation of the traditional and
non-traditional composites, as well as the composites
presently used by Navy and Marine Corps.

It is interesting to note the superiority of this
non-cognitive composite over the traditional mental ability
screen. The difficulty with this composite is that the two
interest scales are transparent. For this reason, this
composite should be utilized as a recruiting tool, rather
than a mandatory selection criterion, since it would quickly
lose its utility during periods of recruiting shortfalls.

Another  problem  with  most  standardized  testing  lies  in 
the  fact  that  the  same  static  measurement  instrument  must  be 
used  to  assess  a  wide  range  of  abilities.  This  procedure 
increases  the  testing  error  from  two  sources:  1)  the  test 
length  is  necessarily  long,  which  contributes  to  examinee 
fatigue;  2)  each  test  will  contain  questions  which  are  not 
at  an  appropriate  difficulty  level  for  every  examinee:  if 
too  easy,  the  examinee  will  become  bored  and  may  carelessly 
mark  incorrectly;  if  too  difficult,  the  examinee  will  guess, 
thus  increasing  test  noise. 

With the recent enhancements to computers and the
development of the Owen Bayesian algorithm, testing of
personnel abilities can achieve the same precision of
ability estimation with a minimized number of items. The
Civil Service Commission (Urry, 1975) has demonstrated that
with an on-line, real-time, adaptive testing sequence, the
same precision of ability estimation of a conventional test
could be achieved with an average of only one-fifth the
number of test items. A computerized adaptive testing (CAT)
program at AFEES could provide many benefits. Some of these
are:


1)  Greater  test  precision  at  all  ability  levels, 
especially  at  the  tails  of  the  distribution 

2)  Improved  test  security 

3)  Decreased  misclassification 

4)  Reduction  of  examinee  anxiety  or  boredom 

5)  Reduction  of  test  length 

6)  Enhanced  applicant  motivation  with  immediate 
feedback  on  response  results 

7)  Standardized  test  administration 

8)  Improved  data  quality  through  elimination  of  human 
requirements  for  calculations  and  data  recording 
9)  Interface  with  classification,  assignment,  and  job 
information  systems 
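The adaptive testing sequence described above can be illustrated with a small sketch. It is not Owen's closed-form procedure: as a stand-in, it keeps the Bayesian posterior for ability on a numeric grid, always administers the unused item whose difficulty is closest to the current ability estimate, and stops once the posterior standard deviation falls below a target. The logistic item model, the grid, the stopping values, and all names are assumptions made for illustration only.

```python
import math

# Ability grid from -4.0 to +4.0 in steps of 0.1 (assumed range).
GRID = [-4.0 + 0.1 * i for i in range(81)]


def p_correct(theta, b, a=1.0):
    """Logistic item response function (assumed model, discrimination a)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def normal_prior():
    """Standard normal prior over ability, normalized on the grid."""
    w = [math.exp(-0.5 * t * t) for t in GRID]
    total = sum(w)
    return [x / total for x in w]


def update(post, b, correct):
    """Bayes update of the grid posterior after one item response."""
    like = [p_correct(t, b) if correct else 1.0 - p_correct(t, b) for t in GRID]
    new = [p * l for p, l in zip(post, like)]
    total = sum(new)
    return [x / total for x in new]


def mean_sd(post):
    """Posterior mean and standard deviation of ability."""
    m = sum(t * p for t, p in zip(GRID, post))
    v = sum((t - m) ** 2 * p for t, p in zip(GRID, post))
    return m, math.sqrt(v)


def adaptive_test(answer, bank, sd_target=0.4, max_items=30):
    """Administer items adaptively.

    answer: callable taking an item difficulty, returning True if correct.
    bank:   list of item difficulties (hypothetical item pool).
    Returns (ability estimate, posterior sd, number of items used).
    """
    post = normal_prior()
    remaining = list(bank)
    used = 0
    while remaining and used < max_items:
        m, sd = mean_sd(post)
        if sd <= sd_target:
            break
        b = min(remaining, key=lambda d: abs(d - m))  # best-matched item
        remaining.remove(b)
        post = update(post, b, answer(b))
        used += 1
    m, sd = mean_sd(post)
    return m, sd, used
```

Because each item is chosen near the examinee's current estimate, few items are spent on questions that are far too easy or far too hard, which is the source of the reduction in test length claimed for CAT.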

The Navy and Marine Corps are working jointly on a CAT
project (Gorman, 1977) to demonstrate its effectiveness
within a military environment. The project involves
psychologists from Headquarters, Marine Corps and the Navy
Personnel Research and Development Center, and Marine
recruits located at San Diego. All of the research to date
raises the question of not if, but when CAT should be
implemented at AFEES. Computerized adaptive testing gives
great promise towards more precise and faster ability
measurement.


TABLE 1

VERIFICATION COMPOSITE STATISTICS

ASVAB raw scores           Best ASVAB                             Standard error
to be predicted            prediction                r²           of estimate

WK + AR (GT) (a)           GS + MK + GI + MC (a)     .79          5.2
WK + AR + SP (AFQT) (b)    GS + MK + GI + MC (b)     [illegible]  6.2

(a) WK + AR = [intercept illegible] + 0.685 (GS + MK + GI + MC)
(b) WK + AR + SP = 8.50 + 0.849 (GS + MK + GI + MC)
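The verification procedure described in the text can be sketched with the AFQT equation from footnote b of Table 1. This is a rough illustration only. The slope is garbled in the source and is read here as 0.849, which is an assumption; the function names are hypothetical; and the tests assume prediction errors are roughly normal with a standard deviation equal to the reported standard error of estimate (6.2).

```python
import math
from statistics import NormalDist, mean

INTERCEPT = 8.50   # footnote b of Table 1
SLOPE = 0.849      # ASSUMED reading of the garbled coefficient in footnote b
SEE = 6.2          # standard error of estimate for the AFQT prediction


def predicted_afqt_raw(gs, mk, gi, mc):
    """Predicted WK + AR + SP raw score from the non-AFQT subtests."""
    return INTERCEPT + SLOPE * (gs + mk + gi + mc)


def flag_applicant(actual, gs, mk, gi, mc, alpha=0.10):
    """True if actual and predicted scores differ at the given two-sided level."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(actual - predicted_afqt_raw(gs, mk, gi, mc)) > z_crit * SEE


def flag_recruiter(differences, alpha=0.05):
    """True if a recruiter's mean (actual - predicted) difference is significant.

    differences: one (actual - predicted) value per applicant, assumed independent.
    """
    z = mean(differences) / (SEE / math.sqrt(len(differences)))
    return abs(z) > NormalDist().inv_cdf(1.0 - alpha / 2.0)
```

A flagged applicant would then be retested on the alternate verification form, and flagged recruiters or stations would appear on the management report.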
TABLE 2

               Cumulative
               fraction of
               variance
               explained      Partial
Variable       (r²)           F-statistic (a)    Coefficient

ASVAB AFQT     .050           82.5               -.00545
EDUCATION      .066           59.7               -.13726
AGE            .070           11.0               .01747
DEPENDENTS     .070           0.0
(Constant)                                       .22857

(a) Partial F-statistic when all variables shown are entered in the
regression equation. The critical value is 3.84 for the 95-percent
confidence level and 6.63 for the 99-percent confidence level.
TABLE 3

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 17

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9      ≤8

I        93-100       90     90     77     77     72     67     62
II       65-92        88     89     76     76     71     66     60
III A    50-64        84     84     72     72     66     62     56
III B    31-49        80     80     68     68     62     58     52
IV A     21-30        74     74     61     61     56     51     45
IV B     10-20        73     73     61     61     55     51     45
V        0-9          65     65     52     52     47     43     37
TABLE 4

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 18

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9      ≤8

I        93-100       91     91     78     78     73     68     63
II       65-92        89     90     77     77     72     67     61
III A    50-64        85     86     73     73     67     63     57
III B    31-49        81     81     69     69     63     59     53
IV A     21-30        75     75     62     62     57     52     46
IV B     10-20        74     74     62     62     56     52     46
V        0-9          66     66     53     53     48     44     36

TABLE 5

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 19

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9            ≤8

I        93-100       88     88     75     75     70     66           60
II       65-92        86     87     74     74     69     64           58
III A    50-64        82     83     70     70     65     60           54
III B    31-49        78     79     66     66     61     56           50
IV A     21-30        72     72     59     59     54     49           44
IV B     10-20        71     71     59     59     54     49           43
V        0-9          63     63     50     50     45     [illegible]  35
TABLE 6

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 20

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9      ≤8

I        93-100       85     85     72     72     67     62     57
II       65-92        83     83     71     71     65     61     55
III A    50-64        79     80     67     67     61     57     51
III B    31-49        75     75     63     63     57     53     47
IV A     21-30        69     69     56     56     51     46     40
IV B     10-20        68     68     55     55     50     43     40
V        0-9          60     60     47     47     42     37     32

TABLE 7

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 21

Mental   ASVAB AFQT          Grades of school completed
Group    score
TABLE 8

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 22

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9      ≤8

I        93-100       79     80     67     67     62     57     51
II       65-92        78     78     65     65     60     55     50
III A    50-64        74     74     61     61     56     51     46
III B    31-49        70     70     57     57     52     47     42
IV A     21-30        63     63     51     51     45     41     35
IV B     10-20        63     63     50     50     45     40     35
V        0-9          54     55     42     42     37     32     26
TABLE 9

PREDICTED CHANCES OF SUCCESS: PROFILE 1
Age 23

Mental   ASVAB AFQT          Grades of school completed
Group    score        >12    12     GED    11     10     9      ≤8

I        93-100       75     76     63     63     58     53     47
II       65-92        74     74     61     61     56     52     46
III A    50-64        70     70     57     57     52     47     42
III B    31-49        66     66     53     53     48     43     38
IV A     21-30        59     59     47     47     41     37     31
IV B     10-20        59     59     46     46     41     36     31
V        0-9          50     51     38     38     33     28     [illegible]
TABLE 10

SUMMARY OF REGRESSION
TO SELECT ATTRITION COMPOSITE

                             Cumulative
                             fraction of
                             variance
                             explained     Partial
Variable                     (r²)          F-statistic (a)   Coefficient

NUMERICAL OPERATIONS (NO)    .051          18.3              -.00410
COMBAT SCALE (CC)            .074          54.5              -.01453
EDUCATION                    .093          45.3              -.11009
SPACE PERCEPTION (SP)        .098          11.1              -.00663
ATTENTION TO DETAIL (AD)     .100          5.9               -.00455
ATTENTIVENESS SCALE (CA)     .101          5.5               -.00597
(Constant)                                                   .89118

(a) Partial F-statistic when all variables shown are entered in the
regression equation. The critical value is 3.84 for the 95-percent
confidence level and 6.63 for the 99-percent confidence level.


TABLE 11

SUMMARY OF REGRESSION
TO EXPLAIN TOTAL DISCHARGES
(Using attrition composite)

                       Cumulative
                       fraction of
                       variance      Partial
Variable               explained     F-statistic (a)   Coefficient

ATTRITION COMPOSITE    .086          106.7             -.00476
EDUCATION              .101          52.8              -.12855
AGE                    .104          7.9               .01548
ASVAB AFQT             .104          0.0
DEPENDENTS             .104          0.0
(Constant)                                             0.59484

(a) Partial F-statistic when all variables shown are entered in the
regression equation. The critical value is 3.84 for the 95-percent
confidence level and 6.63 for the 99-percent confidence level.
TABLE 12

PREDICTED CHANCES OF SUCCESS (a,b): PROFILE 2
(High school graduates)

Attrition
Composite (ATT)                        Age
raw score          17     18     19     20     21     22     23

180                100    100    100    100    100    100    100
160                100    100    100    98     96     95     93
140                93     92     90     88     87     85     84
120                84     82     80     79     77     76     74
100                74     73     71     69     68     66     65
80                 64     63     61     60     58     57     55
60                 55     53     52     50     49     47     46

(a) Success probability = 1 - probability of premature discharge
(b) Chances of success = 100(1 - (0.59484 - 0.00476(ATT) - 0.12855 + 0.0158(AGE)))
Chances calculated at slightly greater than 100 are reported as 100.
TABLE 13

PREDICTED CHANCES OF SUCCESS (a): PROFILE 2
(Non-high school graduates)

Attrition
Composite (ATT)                        Age
raw score          17     18     19     20     21     22     23

180                99     98     96     94     93     91     90
160                90     88     87     85     83     82     80
140                80     79     77     75     74     72     71
120                71     69     67     66     64     63     61
100                61     60     58     56     55     53     52
80                 51     50     48     47     45     44     42
60                 42     40     39     37     36     34     33

(a) Success probability = 1 - probability of premature discharge
Chances of success = 100(1 - (0.59484 - 0.00476(ATT) + 0.0158(AGE)))
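The expectancy values in Tables 12 and 13 follow from the regression equation given in their footnotes, and a short sketch makes the arithmetic explicit. The coefficients below are read from footnotes that are partly garbled in the source (constant 0.59484, age weight 0.0158), so they should be treated as assumptions; the function name and the floor at 0 are illustrative additions, while the cap at 100 is stated in the Table 12 footnote.

```python
def chance_of_success(att, age, hs_graduate):
    """Predicted chance (0-100) of completing 14 months of service.

    att:         attrition composite (ATT) raw score
    age:         age at enlistment in years
    hs_graduate: True applies the education term (-0.12855) used in Table 12
    """
    # Predicted probability of premature discharge (coefficients as read
    # from the table footnotes; see the assumptions noted above).
    p_discharge = 0.59484 - 0.00476 * att + 0.0158 * age
    if hs_graduate:
        p_discharge -= 0.12855
    chance = 100.0 * (1.0 - p_discharge)
    # Values above 100 are reported as 100 (Table 12 footnote);
    # the floor at 0 is an added safeguard, not from the source.
    return min(100, max(0, round(chance)))
```

With these readings, a 17-year-old high school graduate with an ATT raw score of 100 gets 74, and the same score for a non-graduate gives 61, matching the corresponding table entries.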


TABLE 14

CROSS-VALIDATION OF PROFILES

                                                  Percentage of variance
                                                  explained by regression
                                                  (r²)

Cross-validation sample
  Profile 1 (ASVAB AFQT, EDUC, AGE)               0.080
  Profile 2 (ATT, EDUC, AGE)                      0.100
  Current Navy (ASVAB AFQT, EDUC, AGE, DEP)       0.080
  Current Marine Corps (ASVAB AFQT, ASVAB GT,
    EDUC)                                         0.077

Original sample
  Profile 1 (ASVAB AFQT, EDUC, AGE)               0.070
  Profile 2 (ATT, EDUC, AGE)                      0.104
ASVAB:  An  Adventure 
in  Joint  Service  Cooperation 

by 

Major  Wayne  S.  Sellman 
Air  Force  Military  Personnel  Center 


At the 1975 Military Testing Association conference,
I served as chairman for a symposium entitled, "Use of
a Common Aptitude Test for Entry into All Military
Services" (Sellman, 1975). The purpose of that symposium
was to provide the background which led to the decision
by the Assistant Secretary of Defense (Manpower and
Reserve Affairs) to use a common enlistment test and to
discuss its development and implementation. Of course,
that test was the Armed Services Vocational Aptitude
Battery (ASVAB).

Some two years later I am privileged to participate
in yet a second ASVAB symposium. This morning's,
however, is somewhat different from the previous one in the
sense that then we were still four months from
implementation and were speaking to "developmental data."
Today, we have 22 months experience with ASVAB under our
belts and our data are empirical in nature. (Some might
say "school of hard knocks" data.)

Well, with all that experience and data now in our
possession, how, you might ask, is ASVAB working as a
joint service aptitude battery? The answer is - pretty
well - especially when you realize that it is intended
to meet the selection and classification needs (and,
hence, philosophies) of all four services. Undoubtedly,
a test designed to serve many masters won't be quite as
precise as one working for only one. Yet, in its
multifaceted role, ASVAB is a good test (Fischl, Raney, and
Seeley, 1978; Swanson, 1978; Valentine, 1977).

As you might imagine, with a test which must be all
things to all people, from time to time some interesting
scientific and management problems arise. To resolve
such problems, two joint service committees have been
established by the Office of the Assistant Secretary of
Defense (Manpower and Reserve Affairs) (OASD(M&RA)). The
first is the ASVAB Steering Committee. It is chaired by
a deputy assistant secretary of defense, and its members
are flag officers from each Service's personnel office.
Its charter is to provide policy recommendations on ASVAB
development and use to OASD(M&RA). The second, the ASVAB
Working Group, consists of testing policy staffers and
laboratory scientists from each Service plus
representatives from the Military Enlistment Processing Command
(MEPCOM). Its responsibility is to "handle" the on-going
problems of "building, installing, and maintaining" a
joint service test.

The last three years have seen some very interesting
developments, both political and scientific, in the ASVAB
area. As Chairman of the ASVAB Working Group, I have
been intimately involved in almost all of them. Now that
ASVAB has become an accepted Service fact-of-life and
validity analyses are substantiating its claims of
scientific merit, it seems appropriate that the evolution of
joint service cooperation be documented. (In short, the
behind-the-scenes ASVAB story can be told.) The "ASVAB
road" has not always been smooth and certainly not
constructed entirely of "yellow bricks" - at least not for
the first few miles. Nevertheless, the early days
notwithstanding, today ASVAB can be pointed to with pride
as an example of a joint service project that worked.

With  the  above  by  way  of  background,  I'd  like  to 
share  with  you  two  examples  of  joint  service  interaction/ 
cooperation.  They  represent,  if  not  the  ends  of  the 
cooperation  continuum,  certainly  a  close  approximation 
thereto.  Further,  they  illustrate  the  distance  traveled 
in  the  last  three  years. 

Let's begin with the low end of the scale. In July
1975, the ASVAB Steering Committee met to discuss the
status of ASVAB development. One issue that surfaced
concerned the inclusion of the Army Classification
Inventory (ACI), a short interest test used to select soldiers
for combat arms jobs, in the high school version of
ASVAB. The Army's position was that whatever version of
the test was used in the high schools, it should provide
the same scores as the versions used for production
testing. The Air Force, Navy, and Marine Corps believed that
the high school test with the ACI was too long for use in
the schools. With ACI, testing time was three hours,
five minutes; without it, two hours, 45 minutes. The
Armed Forces Vocational Testing Group (AFVTG) (MEPCOM's
predecessor unit) speculated that using ASVAB with ACI
they might lose up to approximately 60% of the
participating high schools.
Despite all efforts by the ASVAB Working Group, the
ACI issue remained unresolved. It again was the main
topic of discussion for the ASVAB Steering Committee
which met in late August 1975. In addition to the length
issue, Air Force and Navy indicated that they believed
the ACI was also inappropriate for use in the high
schools because of questionable content. Questions had
a "weapons, outdoors, sports" orientation which the AFVTG
reported might be offensive to some high school
counselors and students. Additionally, because it was an
interest inventory and not an aptitude measure, it might also
present invasion of privacy problems.

Since the Army was the only Service to use the ACI,
the Air Force and Navy recommended to OASD(M&RA) that it
be deleted from the high school test and administered
during the earlier phases of processing at Armed Forces
Examining and Entrance Stations (AFEES). The Army
retained its position that the ACI was required as input
into its classification decisions.

Well, as you can readily see, this is a classic case
of the Services agreeing to disagree. After three months
further deliberation, OASD(M&RA) finally decided to
delete the ACI from the high school ASVAB. The result
of this interservice squabble was a six months delay in
implementing ASVAB in the high schools.


Now, let's look at joint service cooperation at its
best. In February 1977, Professor Lee J. Cronbach of
Stanford University wrote the Assistant Secretary of
Defense (Manpower and Reserve Affairs) concerning the
high school ASVAB. He had been asked by Buros' Mental
Measurements Yearbook to review the test and prepare a
critique for their next edition. Cronbach had several
criticisms; the major one was that the intercorrelations
between the high school composites were too high to be
of value for vocational guidance. Each Service computes
its own set of composites from ASVAB and uses them for
classification. In addition, MEPCOM had derived its own
composites for use by high school counselors. It was the
latter ones taken to task by Cronbach.

As a result of the Cronbach letter, the Deputy
Assistant Secretary of Defense (Military Personnel
Policy) decided that the high school composites needed to
be reconfigured. In mid-March 1977, he asked the Service
personnel R&D laboratories under the auspices of the
ASVAB Working Group to develop new ones. By late March,
each of the laboratories had developed candidate sets of
composites, and in early April the ASVAB Working Group
selected the ones proposed by the Army Research Institute.
Then, one week later, the ASVAB Working Group assisted
MEPCOM in revising the high school counseling materials
to reflect the new composites. Because of the
concentrated and integrated actions on the part of MEPCOM, the
Service personnel R&D laboratories, and the ASVAB Working
Group, all revised materials were in the field in time
for school year 1977-78. Without joint service
cooperation, the entire effort would have been impossible.

Not every ASVAB-related matter that comes along
receives the same level of joint service consideration.
But, given each Service's unique procurement and
placement problems, that is to be expected. The point is that
with experience in the joint service arena, trust and
rapport between the Services have grown. Now when a
problem surfaces, the members of the ASVAB Working Group
contact each other - no one operates in a vacuum.

In conclusion, since 1974 we have moved from the
concept of a common enlistment eligibility test through
the difficulties of its development and implementation to
an operational battery administered to over two million
examinees annually. I believe that this is a tribute to
the dedication, perseverance, and just plain hard work of
all those who have been associated with the ASVAB. Its
utility has been demonstrated; its support within
OASD(M&RA) and the Services is strong. In this healthy
environment, I look forward to the next several years of
the ASVAB adventure and hope I can take part in still
another MTA ASVAB symposium in 1979.
CODAP in the Design of Concurrent Validity Research


Marvin H. Trattner
U.S. Civil Service Commission


The Civil Service Commission has been engaged for the past three
years in a large scale project to assess the criterion-related validity
of the PACE test. This test is used as part of the examining procedure
to fill entry level vacancies in approximately 120 different professional
and administrative occupations in many agencies of the Federal government.
During the past year the PACE test was administered to 220,000 individuals
and 8,000 vacancies were filled from PACE registers. More recent college
graduates enter the federal service via the PACE examination than through
any other method. Because of the importance of the examination this large
scale project was undertaken to assess the criterion-related validity of
the PACE test. The criterion-related validity for very populous PACE
occupations is being determined to supplement the construct validation procedure
employed in PACE test development.

The first three occupations studied were Social Security Administration
Claims Authorizer, Customs Inspector and Internal Revenue Officer.
These are occupations which are unique to government service and for which
the PACE is heavily used. Each occupation is found in only one major
government agency and has large concentrations of employees in large
metropolitan areas.

A concurrent validity paradigm was utilized which assessed the
criterion-related validity of PACE for the full performance grade level. The
PACE test, criterion instruments, a task inventory and a biographical
information  blank  were  administered  to  currently  employed  individuals  in 
the  grade  level  which  contained  the  largest  number  of  employees  in  the 
occupation.  Research  participants  were  tested  for  eight  to  twelve  hours 
depending  on  the  occupation  studied. 

The task inventory was analyzed by the CODAP programs. It can be
said to be the keystone of the project and this paper will focus on the
procedures employed and the results obtained with the task inventory.

This was the first time CODAP was used by the Commission. For this purpose
we arranged to install CODAP on a Forest Service UNIVAC computer at
Fort Collins, Colorado. Task inventory data for the first occupation
studied, the SSA Claims Authorizer, were analyzed for us by our friends
at the Navy's NOTAP. This was before output from Fort Collins was available.
This  effort  also  represented  the  first  major  application  of  CODAP  to 
federal  civilian  employees  outside  the  Department  of  Defense. 

The task inventory listed the tasks performed in an occupation grouped
into major job components or duties. Each task statement consisted of a
transitive verb together with an object acted upon. The subject "I" was
implicit in the task statements. Respondents indicated whether or not they
perform each task. For all tasks they perform, they indicated the relative
amount of time they spend performing the task compared to all other tasks
they perform. The relative time spent rating was made on a seven-point
scale with the following end points: "very much below average" and
"very much above average."

CODAP sums a respondent's relative time ratings and divides each
task rating individually by this sum to provide a measure of the relative
time spent by the respondent on each task. The time spent in duty performance
is the sum of the relative time spent in the tasks which compose
the duty. These data are then fed into other CODAP programs which provide
very useful analyses. One program calculates a group job description
by averaging the individual job descriptions for any specified group of
respondents. Another program compares the respondents with each other and
clusters them according to the similarity of work performed. Figure 1 in
the handout contains the first page of the group job description for
Social Security Administration Claims Authorizers.
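
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the CODAP programs themselves: the function names, the task and duty labels, and the ratings are hypothetical, and expressing relative time as a percentage is an assumption suggested by the "percent time spent" wording used in the tables.

```python
from collections import defaultdict


def relative_time(ratings):
    """Turn one respondent's raw ratings {task: rating} into percent time per task."""
    total = sum(ratings.values())
    return {task: 100.0 * rating / total for task, rating in ratings.items()}


def duty_time(task_percent, task_to_duty):
    """Sum percent time over the tasks that compose each duty."""
    totals = defaultdict(float)
    for task, percent in task_percent.items():
        totals[task_to_duty[task]] += percent
    return dict(totals)


def group_job_description(individuals):
    """Average percent time per task over a group; a task not performed counts as 0."""
    tasks = set().union(*individuals)
    return {t: sum(p.get(t, 0.0) for p in individuals) / len(individuals) for t in tasks}


# Hypothetical ratings on the seven-point scale (only tasks performed are rated).
first = relative_time({"review claim": 6, "compute benefit": 3, "file report": 1})
second = relative_time({"review claim": 2, "compute benefit": 2})
duties = duty_time(first, {"review claim": "Adjudication",
                           "compute benefit": "Adjudication",
                           "file report": "Records"})
group = group_job_description([first, second])
```

Because each respondent's ratings are divided by that respondent's own sum, the percentages always total 100 for every individual, which is what makes individual job descriptions comparable and averageable.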


Research Design


Criteria used in assessing the validity of the PACE consisted of
several different measures of job performance. Four criterion instruments
were developed: a job information test, a work sample test, a specially
developed supervisory rating form, and a supervisory ranking form.

Each criterion instrument was scored for the duties composing the
job. These were the same duties used to group the tasks included in
the inventory. In order to obtain an overall measure of job success for
a criterion instrument, the duty scores were weighted by duty importance.
The relative time spent in duty performance was used as the measure of
duty importance.
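
A minimal sketch of this weighting follows. The duty names, scores, and weights are hypothetical, and dividing by the total weight (so the composite stays on the scale of the duty scores) is an assumption; the paper states only that duty scores were weighted by relative time spent.

```python
def overall_score(duty_scores, duty_weights):
    """Weighted mean of duty scores, using relative time spent in each duty as the weight."""
    total_weight = sum(duty_weights[d] for d in duty_scores)
    return sum(duty_scores[d] * duty_weights[d] for d in duty_scores) / total_weight


# Hypothetical duty scores on one criterion instrument and percent time spent per duty.
scores = {"Adjudication": 80.0, "Records": 50.0}
weights = {"Adjudication": 75.0, "Records": 25.0}
composite = overall_score(scores, weights)
```

With these illustrative numbers the duty that takes three quarters of the time contributes three quarters of the composite.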

The CODAP clustering program was used to indicate the homogeneity of
the occupation. This demonstrated whether the research participants were
performing the same occupational tasks and consequently could be expected
to take the same criterion instruments.

The  CODAP  group  job  description  listed  the  most  time  consuming  tasks 
and  duties  and  was  used  to  check  the  adequacy  of  the  criterion  measures. 

Due to time constraints, although the task inventory was developed
prior to the criterion instruments, the CODAP output became available
only during the final phases of criterion construction. Consequently
the correspondence between time spent in duty performance and coverage
of the criterion measures was high but not perfect. For each occupation
the task inventory was administered prior to the main data collection to a
representative sample of employees. In addition, for two of the three
occupations the task inventory was administered to all research participants.
Also, for Customs Inspector and Internal Revenue Officer, supervisors rated
the occupational tasks for relative difficulty.

In summary, the task inventory served the following functions:

- test the homogeneity of the occupation,

- check the adequacy of the criterion measures,

- weight the duty scores for a criterion in order to
obtain an overall measure of job success,

- select participants who perform in the dominant job
type should the occupation prove not to be homogeneous.

Figure 2 in the handout describes the research design. Although
there were some variations in the design due to administrative constraints,
the figure serves generally to describe the procedure.


Task Inventory Construction

The three occupations studied each appeared in only one major federal
agency and were one of the key jobs in those agencies.

In each case we were assured by agency management that the occupation
was homogeneous. It was stated that employees at the full performance
level (with some minor exceptions) were performing the same tasks and
consequently could be administered the same criterion measures. This was
alleged to be true regardless of geographical location.

Perhaps because of this situation, the task inventory was constructed
with relatively little difficulty. It took an average of 7 subject matter
experts working for a week to construct the inventory. We generally
employed senior journeymen, working leaders and first level supervisors to
write the task statements. The task inventories contained an average of
425 tasks and 11 duties.


Criterion  Construction 


CODAP data were used differently to construct each of the criteria.
The first level supervisor rated and ranked subordinates who participated
in the study. Subordinates were rated on a graphic rating scale which
described levels of performance on each duty and in some cases on a portion
of a duty. It was hoped that focusing attention on duty performance
would foster objectivity in supervisory ratings. Consequently more
inferential trait ratings were deliberately excluded.

Duty descriptions were written by the subject matter experts who
constructed  the  task  inventory.  The  descriptions  included  a  listing  of 
the  tasks  judged  to  be  prominent  by  the  subject  matter  experts.  The  end 
points  of  the  scales  were  similarly  defined  by  them.  The  ranking  form 
consisted  of  the  duty  descriptions  with  the  scale  points  removed. 

The job information test was a multiple choice objective test measuring
the examinee's job knowledge. The work sample was a work simulation
in which problems similar to those encountered by the journeyman on the
job were presented for solution. An attempt was made to make the work
sample problems as realistic as possible. Both tests were constructed by
subject matter experts employed in the field and also in agency
headquarters. Subject matter experts assigned each scorable item to the most
appropriate duty so that duty scores could be obtained for each test.

As stated previously, test items were constructed prior to the
availability of the CODAP job descriptions. However, when the items were
assigned to the most appropriate occupational duty the match between items
scored and relative time spent was satisfactory. With one exception the
most important duties contained the largest number of test items. That
exception was for the revenue officer duty, "locating and contacting
taxpayer." Moderately difficult test items could not be written for this
time consuming duty. Table 1 serves to compare the relative time spent
and number of points scored for the job information test and the work
sample by occupational duty for the three occupations.

The table shows that the criterion tests measured from 53 to 96% of
the job content as determined by relative time spent in duty performance.
The rating and ranking forms were developed to record ratings for each
duty and consequently they were used to measure performance of the entire
job.

It would have been desirable to distribute items to the duties in
proportion to time spent in duty performance. However the match is
reasonably close and some gain may be achieved by having subject matter
experts construct what they consider to be good items without imposing
subject matter constraints on them. As previously stated, when overall
job performance was scored, the duty scores were weighted by relative
time spent.

When  the  use  of  a  test  is  challenged  in  the  courts  there  is  a  great 
advantage  to  being  able  to  document  the  relevance  of  criterion  measures 
supporting  the  use  of  the  test.  CODAP  output  is  ideal  for  this  purpose. 

For the Claims Authorizer and the Revenue Officer occupations, relative
time spent in task and duty performance was used as an indication of
importance to occupational success. For the Customs Inspector occupation where
law enforcement tasks are performed and are judged to be very important,
relative time spent and relative difficulty level were summed in order
to derive a measure of task and duty importance. It was the sum of
these two values that was used to weight the duty scores to obtain an
overall measure of job success. No ratings of task importance were
obtained and hence could not be used for this purpose.


Results

Task Inventory. Table 2 reveals that the agency managers were quite
correct. The occupations were very homogeneous as indicated by the average
percent overlap for each total group. We have been told by NOTAP
personnel that they have never seen group average percent overlap values
as high as the ones we obtained. The fact that these were single agency
civilian occupations which dealt with subject matter unique to government
service and that we only surveyed one grade level contributed to achieving
this high homogeneity.
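
As a rough illustration of what an average percent overlap measures, the sketch below assumes the commonly described definition of time-spent overlap between two job descriptions (the sum, over tasks, of the smaller of the two percent-time values) and averages it over all pairs in a group. The paper does not state the formula, so this is an assumption rather than a description of the CODAP program; the data are hypothetical.

```python
from itertools import combinations


def percent_overlap(a, b):
    """Overlap of two job descriptions {task: percent time}: sum of the smaller value per task."""
    return sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in set(a) | set(b))


def average_percent_overlap(group):
    """Mean pairwise overlap for a group of at least two job descriptions."""
    pairs = list(combinations(group, 2))
    return sum(percent_overlap(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical respondents; each description sums to 100 percent.
r1 = {"t1": 60.0, "t2": 40.0}
r2 = {"t1": 50.0, "t2": 30.0, "t3": 20.0}
r3 = {"t1": 60.0, "t2": 40.0}
homogeneity = average_percent_overlap([r1, r2, r3])
```

Under this definition two respondents who spend their time identically overlap 100 percent, and a group value near 100 indicates that nearly everyone is doing the same job.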


Criterion  Instruments 

When  the  criterion  duty  scores  for  the  various  instruments  were 
intercorrelated  only  some  convergent-discriminant  validity  was  obtained. 
For  example,  the  duty  1  score  obtained  from  the  job  information  test  gen¬ 
erally  correlated  no  higher  with  the  duty  1  score  obtained  from  the  work 
sample  than  it  did  with  other  duty  scores  obtained  from  the  work  sample. 
Also the various duty scores obtained for each instrument tended to
intercorrelate very highly. The short time limits for the instruments probably
precluded reliable differential measurement for the objective criteria.

For the Customs Inspector some convergent-discriminant validity was
obtained. Procedurally related duty scores for the job information test
and the work sample correlated more highly both within and across
instruments than did unrelated duty scores. Campbell and Fiske stated in their
description of the convergent-discriminant validity model that it is very
rarely achieved.

Table  3  in  the  handout  describes  the  internal  consistency  reliability 
coefficients obtained for the criterion instruments. It shows very
satisfactory internal consistency coefficients for the total weighted scores
for the various criteria. These coefficients are similar to ones reported
in  the  literature  for  similar  criterion  instruments. 

PACE Test. Table 4 contains the validity coefficients for the three
occupations. Of the 11 coefficients, 9 are significant beyond the .001
level. Many of the coefficients are very high for job performance
criteria and indicate that a great deal of utility will accrue to the use of
the PACE test.

The two validity coefficients that were not significant were the
supervisory ratings and rankings for Customs Inspectors. The frequent
rotation of Customs Inspectors and the independence of their work
performance contributed to inadequate supervisory knowledge of the research
participants. This may be the cause of the two nonsignificant
correlations.

The results seem to indicate that careful construction of criterion
instruments will promote demonstrations of high validity. Usually tests
are validated against whatever criteria are conveniently obtainable.
Most often the criterion measure is some kind of subjective evaluation
made by a supervisor or an instructor. The validities reported using
these criteria are less consistent and of lesser magnitude than the ones
we obtained.

Few  problems  were  encountered  in  administering  either  the  task  inven¬ 
tory  or  the  criterion  measures.  No  participants  complained  that  the  job 
information  test  or  the  work  sample  contained  unfair  questions.  Some 
respondents  had  difficulty  comprehending  the  relative  time  spent  scale. 
Some  objected  to  reading  through  the  task  inventory  twice.  This  was 
mainly  because  of  the  heavy  work  load  we  had  inflicted  on  them  rather 
than  any  specific  problem  with  the  task  inventory. 

No  research  participants  were  eliminated  from  the  studies  because 
their  CODAP  job  descriptions  differed  from  the  predominant  occupational 
job type. A few could have been eliminated because their job descriptions
had  low  overlap  with  the  major  job  type,  but  they  were  not,  due  to  the 
high  homogeneity  values  obtained  for  the  total  group. 


Use  of  the  Task  Inventory  by  Federal  Agencies 

This  was  the  first  exposure  to  CODAP  for  the  staffs  of  the  three 
federal  agencies.  We  found  some  additional  uses  for  the  CODAP  output  in 
two  of  the  agencies. 

In the Social Security Administration the group job description
served as the basis for the revision of the Claims Authorizer training
course. Similarly the Customs Service plans to employ the output to
revise the Customs Inspector course. Also the Customs Inspector task
inventory was revised and employed in a new work measurement reporting
system. The reporting system describes units of output for Customs
Inspectors and will be used to promote efficiency of operations. The Internal
Revenue Service has not employed CODAP. This may be because Customs and
SSA headquarters managers were more involved in the research.




Conclusions

CODAP  can  be  used  very  effectively  in  criterion  related  validity  research 
to 

-  determine  occupational  homogeneity, 

- check the adequacy of criterion instruments or
to  construct  criterion  instruments, 

-  weight  overall  job  performance, 

-  select  research  participants. 

Additionally it can be used to provide excellent documentation for
the relevance of criterion measures should the selection test be challenged
in the courts.


[Figure 2. Research Instruments and Development Procedure. Columns:
Purpose; Instrument; Time Required (research participant, supervisor).
Job and Demographic Info (Task Inventory, Biographical Info Blank), used
to construct criteria, test homogeneity of job, select subjects, and
weight criteria; Criteria (Job Information Test, Work Sample, Supervisory
Rating Form, Supervisory Ranking Form); Predictor (PACE Test).]


TABLE 1

Comparison of Percent Time Spent in Duty Performance and
Percent of Points Scored for the Job Information
Test and Work Sample for Three Occupations

Total percent time spent in task performance
measured by criterion test:   88   88   85   88   53

TABLE 2

Average Percent Overlap for Time Spent in
Task Performance for Three Occupations

                          Average Percent Overlap

SSA Claims Authorizer
Customs Inspector
Revenue Officer

TABLE 3

Internal Consistency Reliability Coefficients
for the Criterion Instruments*

                        SSA Claims   Customs     Internal Revenue
Criterion               Authorizer   Inspector   Officer

Job Information Test       .81          .67          .64
Work Sample                .72          .60          .78
Rating Form                .79          .57b         .86

TABLE 4

Validity of PACE Test for Three Occupations(b)

                        Claims       Customs     Revenue
                        Authorizer   Inspector   Officer

Work Sample                36(a)        56          56
Job Information Test       61           65          66
Supervisory Rating         30           06          25
Supervisory Ranking        31           03

(a) All coefficients corrected for unreliability in criterion except Work
Sample for Claims Authorizer.

(b) All coefficients significant at p < .001 except Supervisory Rating and
Supervisory Ranking for Customs Inspector.
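
The footnote on correction for unreliability in the criterion does not give the formula used. A minimal sketch, assuming the standard correction for attenuation in the criterion only (observed validity divided by the square root of the criterion reliability), is shown below; the function name and the numbers are hypothetical and are not taken from Tables 3 or 4.

```python
from math import sqrt


def correct_for_criterion_unreliability(observed_r, criterion_reliability):
    """Standard attenuation correction applied to the criterion only."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return observed_r / sqrt(criterion_reliability)


# Hypothetical values: observed validity .45, criterion reliability .81.
corrected = correct_for_criterion_unreliability(0.45, 0.81)
```

The correction estimates the validity that would be observed with a perfectly reliable criterion, so it can only raise a coefficient, and it raises it more when the criterion is less reliable.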


A Job Analysis Model for Use in Police Selection


Leon Z. Wetrogan, Ph.D.
Cynthia C. Diane
U.S. Civil Service Commission

Introduction


The Personnel Research and Development Center of the Civil Service
Commission as part of its mission is responsible for the development and
documentation of the entry level examinations for selecting Washington,
D.C. Policemen. Approximately two years ago, the decision was made to
conduct an extensive job analysis and where necessary develop a new
examination procedure for the entry level patrolman job. This decision
was based on a number of factors including: 1) the job may have changed
since the original research was conducted some years ago; 2) new methods
and techniques had been introduced in job analysis since the original
research; and 3) new advances in personnel assessment techniques had
been developed which might be employed to measure those job related
knowledges, skills, abilities and other characteristics, KSAO's, not
included in the current examination.

A  number  of  factors  were  considered  in  the  selection  of  a  job  anal¬ 
ysis  methodology.  Among  these  factors  were:  1)  the  technique  would 
have  to  allow  for  the  collection  of  information  from  a  fairly  large  re¬ 
presentative  sample  of  subjects;  2)  the  method  of  data  collection  would 
have  to  have  face  validity  in  order  to  obtain  and  maintain  the  coopera¬ 
tion  of  participants;  and  3)  the  method  would  have  to  lead  to  the  identi¬ 
fication  of  the  important  worker  KSAO's  and  lead  to  the  documentation  of 
their  linkage  to  the  important  or  critical  job  tasks  or  behaviors. 

After  a  review  of  the  job  analysis  literature,  the  decision  was 
made  to  incorporate  the  task  analysis  procedure  and  a  task  by  ability 
matching  technique  to  accomplish  the  job  analysis.  The  major  portion  of 
this  paper  addresses  itself  to  presenting  the  job  analysis  model  and 
discussing  its  use  with  the  Washington,  D.C.  Police. 


Method  and  Results 


The job analysis phase of the project was carried out in four
stages.

Stage  I:  The  Identification  of  Tasks  Performed  by  Police  Officers  and 
the  Development  of  a  Task  Inventory. 

In order to generate a comprehensive list of task statements, three
sources of information were utilized. They included: 1) a content
analysis of training and operational manuals; 2) a five-day brainstorming
session with a panel of seven knowledgeable police officers,
representative by race, sex and geographic location in the City; and 3) observation
by the present investigators during ride along sessions of patrolmen at
work.

Next, a panel of four police personnel consisting of one Lieutenant,
two Sergeants, and one officer was convened. The panel performed three
functions.  First,  they  reviewed  each  task  statement  for  accuracy  and 
clarity  of  language.  Second,  they  eliminated  task  statements  that  were 
duplications.  Third,  they  defined  the  major  duty  areas  represented  in 
the  final  pool  of  task  statements  and  sorted  the  task  statements  into 
the  duty  areas. 

The  final  pool  consisted  of  317  task  statements  grouped  under  four¬ 
teen  major  duty  areas.  This  final  pool  served  as  the  basis  for  the 
development  of  a  Preliminary  Patrolman  Task  Inventory.  The  Patrolman 
Task  Inventory  consisted  of  a  cover  letter  describing  the  purpose  of  the 
study  as  well  as  the  purpose  of  the  task  inventory.  Following  the  letter 
were  thirteen  questions  related  to  demographic,  background  and  experience 
factors  of  each  person  completing  the  inventory.  This  information  was 
collected  for  two  reasons.  First,  the  information  was  used  to  determine 
whether the sample included in the study was representative of the D.C.
Police force. Second, the data were used to determine if differences on
certain  demographic  variables  were  associated  with  differences  in  the 
tasks  performed. 

The  remainder  of  the  Inventory  contained  the  instructions  for  com¬ 
pleting  the  inventory  as  well  as  the  task  statements.  The  instructions 
asked each officer to read through the task statements and check those
which they had personally performed during the previous twelve months.
Next, each officer was instructed to rate each task checked on two
scales: relative importance and relative time spent. Data on relative
time  spent  were  not  collected  for  the  job  analysis  portion  of  the  pro¬ 
ject  but  were  included  to  test  some  basic  research  hypotheses.  Ratings 
on  relative  importance  were  on  a  seven  point  Likert  type  scale  from  1, 
very  much  below  average  in  importance,  to  7,  very  much  above  average  in 
importance. 

Once  the  preliminary  Inventory  had  been  developed,  it  was  sent  out 
for  review  and  comment  to  training  academy  personnel,  top  level  police 
department administrators and union officials. Feedback from these
sources  indicated  that  the  Task  Inventory  was  comprehensive  and  accur¬ 
ately  described  the  job  of  patrolman. 

Following the review, the preliminary Inventory was administered to
a sample of 14 patrolmen in a pilot study. The sample was representative
by race and sex and included two patrolmen from each of the seven
police districts. Information collected during the pilot study did not
lead to any changes in the Task Inventory; however, it did suggest the
need to expand and modify the oral presentation related to the purpose
of the research and the instructions for completing the Inventory.

Stage II: Administration of Patrolman Task Inventory


The Patrolman Task Inventory was administered to a sample of 350
patrolmen, fifty from each of the seven districts, during April, 1976.
The sample represents approximately 14% of the total D.C. patrolman
population. Patrolmen were selected from each district by their
supervising sergeant. The only restrictions placed on the selecting sergeants
were:

1)  the  patrolmen  had  to  be  in  a  patrol  position  rather  than  a 
detective  position  or  some  other  special  assignment, 

2)  the  patrolmen  had  to  have  a  minimum  of  one  year  patrol  experi¬ 
ence  and 

3) the proportion of males, females, blacks and whites had to be
representative of the D.C. police patrolman population.

Stage III: Analysis of Task Inventory Data and Identification of the
Most  Important  Tasks 

Of the 350 officers completing the Inventory, data on fifteen were
eliminated where: 1) officers were felt to have indiscriminately
marked the task statements because they were all marked with the same
numerical rating; 2) officers failed to rate more than 10 percent of the
tasks which they had checked as having performed in the previous twelve
months; or 3) officers had been performing in a position other than
patrolman for more than 25 percent of the previous twelve months. The
final sample included in the analyses consisted of 300 males, 31
females, 162 whites and 167 blacks. It should be pointed out that for a
very small number of subjects race and sex data were not provided;
consequently the sums across race and sex subgroups do not equal the 335
patrolmen included in the total sample.
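
The three elimination rules above can be written as a simple screening function. This is only a sketch: the function and argument names are hypothetical, and the first rule in the study rested on the investigators' judgment, which the mechanical check below merely approximates.

```python
def keep_respondent(ratings, checked_tasks, fraction_in_other_position):
    """Return True if a respondent passes all three screening rules.

    ratings: {task: rating} for tasks the respondent actually rated
    checked_tasks: tasks checked as performed in the previous twelve months
    fraction_in_other_position: share of the year spent outside a patrol position
    """
    rated = [ratings[t] for t in checked_tasks if t in ratings]
    # Rule 1: every rated task carries the same numerical rating.
    if len(rated) > 1 and len(set(rated)) == 1:
        return False
    # Rule 2: more than 10 percent of checked tasks were left unrated.
    unrated = len(checked_tasks) - len(rated)
    if checked_tasks and unrated / len(checked_tasks) > 0.10:
        return False
    # Rule 3: more than 25 percent of the year in another position.
    if fraction_in_other_position > 0.25:
        return False
    return True


# Hypothetical respondent who rated every checked task with varied ratings.
ok = keep_respondent({"a": 3, "b": 5, "c": 4}, ["a", "b", "c"], 0.10)
```
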

Separate CODAP Job Description analyses were performed on the Relative
Importance and Relative Time Spent dimensions for the total sample,
whites, blacks, males and females. Table 1 presents the Average Percent
Importance by all Members on the Duties for the total sample, blacks,
whites, males and females. Also included in the Table in parentheses
are the rank orders for the duties for the total sample and for each sex
and race subgroup. The Table indicates perfect agreement across groups
on the highest four duties. The most important duty was N, Administrative
Activities, followed by H, Conducting Preliminary Investigations,
E, Patrolling For Crime Prevention, and L, Conducting an Arrest.

In order to obtain a more precise indication of the agreement on
average percent importance by all members for the duties, Pearson
Product Moment Correlations were computed between each of the race and sex
groups. The correlation for Average Percent Importance by all members
for duties between males and females was .92. The correlation between
blacks and whites for Average Percent Importance by all members for
duties was .99. These results suggest that there is a high level of
agreement between the races and sexes in terms of duty importance.
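
For readers who want to reproduce this kind of comparison, a minimal sketch of the Pearson product moment correlation between two groups' duty importance values follows. The duty values are hypothetical and are not the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical average percent importance on five duties for two subgroups.
group_one = [12.0, 9.5, 8.0, 6.5, 4.0]
group_two = [11.0, 10.0, 7.5, 7.0, 4.5]
agreement = pearson(group_one, group_two)
```

A value near 1 means the two groups order and space the duties almost identically, which is the sense in which the reported coefficients indicate agreement.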

Since  our  primary  concern  involved  the  identification  of  the  most 
important  tasks,  a  task  level  job  description  analysis  was  performed. 
Separate  job  description  analyses  were  carried  out  for  the  total  sample, 
males,  females,  blacks  and  whites.  Table  2  lists  those  tasks  included 
in the highest third for the total sample on the Average Percent
Importance by all Members. Inspection of the Table indicates that Duty H,
Conducting Preliminary Investigations, contained the largest number of
tasks  in  the  top  third  for  the  total  sample,  20.  Following  Duty  H,  and 
tied  for  second  place  with  16  tasks  each  were  Duties  N,  Administrative 
Activities,  E,  Patrolling  For  Crime  Prevention  and  L,  Conducting  an 
Arrest.  The  duty  with  the  fifth  highest  number  of  tasks  was  M,  Pre¬ 
paring  Cases  for  Court  and  Testifying.  Duties  C,  Patrolling  for  Inci¬ 
dentals,  and  J,  Conducting  Follow  Up  Investigations,  had  the  fewest 
number  of  tasks  in  the  top  third  for  the  total  sample  with  one  each. 

Table 3 presents those tasks in the highest third on Average Percent
Importance by All Members for each of the race or sex groups which were
not in the top third for the total sample. Those task statements in the
highest third on the black task level job description are followed by a
(B). Those in the highest third for whites are followed by (W). And
those task statements in the highest third for females are followed by
(F). All of the tasks in the highest third for males were included in
the highest third for the total sample. Inspection of Table 3 indicates
that there were fifteen tasks in the highest third of the female task
level job description that were not in the highest third for the total


order for each duty within each group.


Table  2 


Tasks  In  The  Highest  Third  On  Importance 
For  The  Total  Sample 

DUTY A - PREPARING FOR TOUR OF DUTY

1. Check proper functioning of radio and siren system

2. Display proper equipment while on duty

3. Load and unload revolver

4. Clean service revolver

DUTY B - PATROLLING TO DETERMINE VIOLATIONS

1. Check for violations

2. Determine ability of occupant/driver to operate vehicle

DUTY C - PATROLLING FOR INCIDENTALS

1. Report fires and accidents

DUTY D - PATROLLING FOR COMMUNITY RELATIONS

1.  Use  standard  automobile  equipment 

2.  Talk  to  people  on  beat  to  establish  good  relations 

DOTY  B  -  PATROLLING  FOR  CRIME  PREVENTION 

1.  To  arrest  or  prevent  the  escape  of  a  person  who  has  committed  or 
attempted  to  co— it  &  crime 

2.  Transmit  mid  receive  on  the  radio 

3.  Use  standard  emergency  equipment  assigned  to  vehicle 

4.  Cruise  at  low  speed  while  observing  lot  crimes  or  incidents 

5.  Check  suspicious  vehicles  for  F.I.C.E>  (fruits,  instrumentalities, 
contraband  and  evidence) 

6.  Check  open  doors  and  windows  for  unlawful  entry 
?.  Use  portable  radio 

8.  Check  public  places  while  on  patrol 

9.  Separate  disorderly  person (s)  from  other  persons  at  scene  of  dis¬ 
turbance. 

10.  Separate  complaintant  from  offender  in  family  argument 

11.  Respond  to  an  emotionally  tense  crowd  condition 

12.  Restore  order  after  responding  to  disorderly  person  call 

13.  Secure  crime  scene 

14.  Protect  ambulance  crew 

15. Check inside business establishments to maintain visibility

16. Walk to attain high visibility

DUTY F - CONTROLLING TRAFFIC AND ENFORCING TRAFFIC LAWS

1. Locate and identify witnesses at accident scene

2. Issue traffic violation citation

3. Interview persons involved in and witnesses to a traffic accident

DUTY G - CARING FOR THE SICK OR INJURED

1. Respond to a mentally deranged and dangerous person call

2. Call for ambulance in an emergency

3. Determine injury of person(s) at scene of crime or accident

DUTY H - CONDUCTING PRELIMINARY INVESTIGATIONS

1. Determine if mentally deranged person is dangerous to himself or
others

2. Check WALES system for identification of person or property

3. Complete form PD 251, Report on Crime Against Person or Property

4. Locate suspect in crime

5. Determine probable cause to arrest or search

6. Determine the type of violation committed

7. Canvass the surrounding area for stolen car

8. Interview individuals to obtain description of missing person

9. Question suspect before arrest

10. Identify victims and witnesses

11. Evaluate content of interview information obtained from victim or
witnesses

12. Interview victim

13. Interview witness

14. Isolate suspect of crime

15. Investigate suspicious persons at scene of crime

16. Visually scan entire building and determine source of break-in

17. Recover all items of evidentiary value at scene of crime

18. Classify incidents to determine the appropriate report

19. Interview complainant concerning crime or incident

DUTY I - HANDLING PROPERTY

1. Record information about seized articles on property book

2. Complete form PD 81 (property receipt)

3. Complete form PD 82 (property book) when property is acquired

4. Mark property to be used as evidence for future positive
identification

DUTY J - CONDUCTING FOLLOW-UP INVESTIGATION

1. Check with teletype room for repossession or impounding of stolen
car

DUTY K - PATROLLING TO APPREHEND OFFENDERS

1. Help secure the safety of an officer in trouble

2. Describe direction of auto to dispatcher when in pursuit

3. Describe vehicle to dispatcher when in pursuit

4. Restrain hostile violators

5. Locate wanted person

6. Pursue suspects on foot

DUTY L - CONDUCTING AN ARREST

1. Complete an arrest for a misdemeanor

2. Advise suspect of rights

3. Prepare form PD 251, Event Report

4. Prepare form PD 255, Arrest Report

5. Prepare form PD 163, Prosecution Report

6. Keep searched prisoner away from others not searched

7. Advise suspect he is under arrest and inform him of the charge

8. Seize the weapon from a suspect

9. Obtain a signed waiver of rights

10. Handcuff a suspect or prisoner

11. Cover front and rear entrances at building where suspect is hiding

12. Place arrested suspects in transport vehicles

13. Arrange for transport vehicles for suspect or prisoner

14. Search for evidence and weapons incidental to a lawful arrest

15. Establish reasonable grounds that subject to be apprehended has
committed the crime

16. Search the suspect for fruits, instrumentalities, contraband, and/or
evidence

DUTY M - PREPARING CASES FOR COURT AND TESTIFYING

1. Report to U.S. Attorney's Office

2. Prepare a traffic case or lesser misdemeanor

3. Prepare Court Papers

4. Produce evidence in court for presentation at trial or hearing

5. Notify witnesses of their scheduled appearance in court

6. Record names and addresses of all witnesses of an incident

7. Testify in felony or serious misdemeanor cases

8. Complete PD 140 (Court Attendance Slip)

9. Relate facts of case to U.S. Attorney or Corporation Counsel

10. Present case to grand jury

11. Testify at preliminary hearing

12. Report to court

DUTY N - ADMINISTRATIVE ACTIVITIES (SUPPORTIVE)

1. Use police communication system

2. Use the call box while on patrol

3. Record run and time on run pad

4. Make proper notifications related to a crime or incident

5. Check all fluid levels in car

6. Check vehicle for damages

7. Check emergency equipment in scout car (lights, siren, etc.)

8. Complete inspection report (PD 7\-3) on vehicle

9. Receive and acknowledge assignment from radio dispatcher

10. Call for necessary assistance

11. Aid in training of rookie policemen

12. Obtain the report numbers after a run from the radio dispatcher

13. Go back into service upon completion of a run

14. Book suspect (complete forms 251, 255, 47, 163, 81, 81-A, 82, and
PD 68)

15. Operate two-way radio

16. Inform communications branch of the disposition of assignment




Table 3

Tasks In The Highest Third On Importance
For The Race Or Sex Subgroups

DUTY A - PREPARING FOR TOUR OF DUTY

1. Pick up daily hot sheet (F)

2. Inspect crime maps for offense patterns (F)

DUTY B - PATROLLING TO DETERMINE VIOLATIONS

1. Check for permits and their validity (F)

DUTY D - PATROLLING FOR COMMUNITY RELATIONS

1. Assist motorist in automobile emergencies such as lost keys, stalled
auto, flat tire, etc. (B)

2. Establish communications with special interest groups in the
community (F)

3. Use map to determine shortest route from one location to another
(F)

4. Inform citizens of how to make homes more secure (B)

DUTY F - CONTROLLING TRAFFIC AND ENFORCING TRAFFIC LAWS

1. Use flares at accident scene to prevent further accidents (B)

2. Observe traffic conditions (W)

DUTY H - CONDUCTING PRELIMINARY INVESTIGATIONS

1. Arrange for crime scene search (W)

2. Describe evidence involved in crime in notebook (F)

3. Fill out PD 106 (Flash Lookout) (B)

4. Read broadcast from PD 106 (B)

5. Determine need for additional manpower at a crime scene or unusual
incident (F)

6. Identify friends and relatives of missing person for questioning
(F)

7. Identify persons entering or leaving crime scene (F)

8. Request owner to report to building following an incident or crime
(F)

DUTY I - HANDLING PROPERTY

1. Obtain from claimant positive identification of property such as
serial number, distinguishing marks, etc. (B)

2. Place evidence in evidence locker (F)

DUTY J - CONDUCTING FOLLOW-UP INVESTIGATION

1. Maintain communication with people you deal with where a follow-up
investigation is necessary (F)

2. Check hot sheet PD 664 for stolen car or missing persons (F)

DUTY K - PATROLLING TO APPREHEND OFFENDERS

1. Chase fleeing suspect with vehicle (W)

DUTY L - CONDUCTING AN ARREST

1. Use physical force to complete arrest (W)

DUTY M - PREPARING CASES FOR COURT AND TESTIFYING

1. Pick up evidence from appropriate clerk for presentation at trial or
hearing (B)

DUTY N - ADMINISTRATIVE ACTIVITIES (SUPPORTIVE)

1. Notify shop official and radio dispatcher when radio is
malfunctioning (B)

2. Forward to appropriate agency any evidence not processed by or
analyzed by MPD (B)

3. Record information from telephone conversations (F)

4. Answer telephone (B)

5. Place prisoner into vehicle (B)

6. Transport prisoner to hospital, court, police station house, central
cell block (B)

7. Log final disposition of case on district station house arrest book
(F)
sample. The duty with the largest number of these tasks was H, Conducting
Preliminary Investigations.


Further inspection of Table 3 points out that twelve tasks were in
the highest third for blacks which were not in the highest third for the
total sample. Of these twelve tasks, five related to Duty N, Administrative
Activities, two each to Duties D and H, and one each to Duties
F, I and M. According to Table 3, the task level job description for
whites led to the addition of only four tasks. There was one each related
to Duties F, H, K and L. Those tasks in the top third for the
total sample as well as those in the top third for either race or sex
subgroup were retained for further analysis.
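
The retention rule described above (keep a task if it falls in the top third for the total sample or for any race or sex subgroup) can be sketched in a few lines of Python. This is only an illustration: the function names, the rounding of the one-third cutoff, the handling of ties, and the scores are all assumptions, not details reported in the study.

```python
import math


def top_third(importance):
    """Return the set of tasks ranked in the highest third by importance.

    `importance` maps task name -> average percent importance.
    Rounding the cutoff up is an assumption; the paper does not say
    how ties or fractional cutoffs were handled.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    cutoff = math.ceil(len(ranked) / 3)
    return set(ranked[:cutoff])


def retained_tasks(by_group):
    """Union of every group's top third; `by_group` maps group -> scores."""
    kept = set()
    for scores in by_group.values():
        kept |= top_third(scores)
    return kept


# Hypothetical scores for six tasks (not data from the study).
scores = {
    "total":  {"t1": 9, "t2": 8, "t3": 5, "t4": 4, "t5": 3, "t6": 1},
    "female": {"t1": 9, "t2": 4, "t3": 8, "t4": 5, "t5": 3, "t6": 1},
}
print(sorted(retained_tasks(scores)))
```

In this toy case task t3 is retained only because it ranks in the top third for one subgroup, which mirrors how the Table 3 tasks were added to those in Table 2.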


Stage  IV:  The  Identification  of  the  Domain  of  KSAO's  and  Their  Linkup 
to  the  Important  Tasks 


The identification of the important tasks was viewed only as an
intermediate step in the job analysis research. The primary goal of the
job analysis was to determine the KSAO's necessary for successful
performance of the patrolman job. Consequently, the task analysis results
were used as the basis for identifying the important KSAO's.


In order to arrive at a preliminary pool of KSAO's, previous job
analysis research both related and unrelated to police work was reviewed.
Included in the review was the work by McCormick et al. with the
PAQ, Fleishman with his task taxonomy work, Furcon and Baehr on the
Skills Attributes Inventory, and Landy and Farr with their Police
Performance Description Scales.


Based on the review, 77 KSAO's and their definitions were identified.
The KSAO's were next sorted by the present investigators into
four broad areas: Cognitive, Social-Personal, Perception, and Physical. In
the Cognitive area there were 18 abilities including, for example, oral
communication, number facility, deductive reasoning, and creativity.
The Social-Personal domain consisted of 24 attributes including tolerance,
perseverance, leadership, and empathy. The Perception area included 16
KSAO's, among which were color discrimination, near visual acuity, visual
form perception, and size perception. The Physical domain consisted of
19 KSAO's including dynamic strength, stamina, multilimb coordination,
and eye-hand coordination. In order to ensure the completeness of the
KSAO pool and the clarity of the definitions, the list of KSAO's along
with their definitions was reviewed by a panel of police personnel. The
panel included four patrolmen and one lieutenant. The review led to some
minor modifications in the KSAO definitions but no additions or deletions
to the total KSAO list.
In order to reduce the list of KSAO's to a more manageable number,
a sample of twenty-one officers, three from each of the seven districts,
was asked to rate each KSAO in terms of its importance for overall job
success. Along with the list of KSAO's each officer was given their
definitions. Officers rated the KSAO's on a five point Likert type scale
from 1, the ability or personal characteristic is of no importance for
successful performance of the police job, to 5, the ability or personal
characteristic is extremely important for successful performance of the
police job. The mean and standard deviation of the ratings for each
KSAO were computed. KSAO's with a mean rating of 3 or higher and a
standard deviation less than 1.0 were retained for further analysis.
The final list consisted of ten cognitive, twelve social-personal, ten
perception and eleven physical KSAO's. Table 4 presents a list of these
KSAO's along with their definitions.
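
The screening rule just described (mean rating of 3 or higher, standard deviation below 1.0) is simple enough to express directly. The sketch below is illustrative only: the function name and the ratings are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not say which formula was applied.

```python
from statistics import mean, stdev


def retain_ksaos(ratings, min_mean=3.0, max_sd=1.0):
    """Return the KSAO names that pass the mean and spread cutoffs.

    `ratings` maps KSAO name -> list of 1-5 importance ratings.
    stdev() is the sample standard deviation (an assumption here).
    """
    kept = []
    for name, values in ratings.items():
        if mean(values) >= min_mean and stdev(values) < max_sd:
            kept.append(name)
    return kept


# Hypothetical ratings from five officers (not data from the study).
ratings = {
    "Judgment": [5, 4, 5, 4, 5],              # high mean, raters agree
    "Color Discrimination": [1, 5, 2, 5, 3],  # mean above 3, raters disagree
    "Creativity": [2, 2, 3, 2, 2],            # raters agree, mean too low
}
print(retain_ksaos(ratings))
```

The two cutoffs do different jobs: the mean screens out KSAO's judged unimportant, while the standard deviation screens out those on which the raters could not agree.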

The next step in the job analysis process involved linking the
forty-three KSAO's to the 137 important tasks and rank-ordering the
KSAO's in terms of their importance for successful performance of the
tasks. In order to link the KSAO's to the 137 tasks, four Ability by
Task rating forms were developed, one for each of the four KSAO domains.
Table 5 presents a sample of one page from the Ability by Task rating
form for the cognitive domain. Along with the rating form and
instructions, each rater was given a set of definitions for the KSAO's.

The instructions asked each officer to begin by getting a clear
understanding of the meaning of each ability contained in the Ability by
Task rating form. Next they were to begin with task 1 and rate how
important each of the abilities was for differentiating superior from
barely acceptable performance of each of the 137 tasks. Ratings were
obtained on a 5 point Likert type scale from 1, the ability or personal
characteristic is of no importance for differentiating superior from
barely acceptable performance of the task, to 5, the ability or personal
characteristic is extremely important for differentiating superior from
barely acceptable performance of the task.

A representative sample, by race and sex, of ten officers from each
of the seven districts participated in this phase of the study. Because
of the amount of time required to rate the KSAO's in the four domains
with the 137 tasks, each officer rated the tasks against only two of the
KSAO domains. Consequently, with the 70 officers each covering two of
the four domains, thirty-five ratings were obtained on each ability by
task combination.

In order to rank order the KSAO's in terms of their overall importance,
the mean of the ratings for each KSAO across the 137 tasks was
determined. This involved computing the mean rating given by an officer
for each KSAO across the 137 task statements. Next, the mean of these


Table 4

Definitions of Skills, Knowledges, Abilities, and Other
Characteristics Retained for Task Matching

Cognitive

1. Oral Communication - ability to communicate ideas with spoken words.

2. Deductive Reasoning - ability to apply a broad, general idea or
principle effectively to a particular problem or case.

3. Inductive Reasoning - ability to find the most appropriate general
concepts or rules which fit sets of data or which explain how a given
series of individual items are related to each other. It involves
the ability to combine conflicting facts, to logically proceed from
individual cases to general principles.

4. Written Communication - ability to write clear and concise letters,
reports, descriptions, or instructions.

5. Judgment - ability to solve a problem when all the necessary facts
to solve the problem are not given.

6. Following Rules and Procedures - ability to follow rules and
procedures in working out job problems.

7. Problem Sensitivity - ability to recognize or identify the existence
of problems. It does not include the reasoning related to solving
the problems.

8. Problem Solving - ability to find practical ways of dealing with
problems and situations.

9. Information Appraisal - ability to evaluate information of an
uncertain or conflicting nature.

10. Verbal Comprehension - ability to understand the meaning of words
and the ideas associated with them.

Social-Personal

1. Pressure of Time - ability to work fast and accurately in situations
where there is time pressure or emotional strain.

2. Tolerance - ability to put up with and handle verbal abuse from a
person or a group.

3. Working to Get Ahead - a liking for work with chances for getting
ahead.

4. Leadership - ability to take the lead or take charge when working or
dealing with others.

5. Cheerfulness - ability to stay pleasant and good-tempered in dealing
with people.

6. Teamwork - ability to work as a member of a group.

7. Dealing with Attack - willingness to use physical force in dealing
in hostile situations.

8. Working Outside - willingness to work outdoors in all kinds of
weather.

9. Repetitiveness - ability to perform the same tasks over and over
without getting bored or careless.

10. Composure - ability to stay calm and level-headed in difficult,
unexpected, or emergency situations.

11. Flexibility - ability to handle unexpected changes on the job, such
as new schedules, new routines, or transfers to different jobs.

12. Dealing with People - ability to deal with people politely and
helpfully, beyond the giving and receiving of instructions.

Perception

1. Visualization - the formation of mental images of figures or objects
as they will appear after certain changes such as unfolding, rotation
or movement of some type.

2. Depth Perception - ability to judge whether objects are near or far
away.

3. Near Visual Acuity - ability to see the details of nearby objects
clearly (within normal reading distance).

4. Far Visual Acuity - ability to see the details of distant objects
clearly (beyond normal reading distance).

5. Visual Form Perception - ability to perceive important detail or
configuration in the environment.

6. Closure - ability to mentally organize a disorganized field into a
single picture.

7. Night Vision - ability to "see in the dark" or to pick up shapes and
movement when lighting is poor or low.

8. Size Perception - ability to estimate about how many objects or
people there are in a certain space.

9. Peripheral Vision - ability to see "out of the corner of the eye"
when looking straight ahead so as to be aware of things or motion to
the side.

10. Sensory Acuity - ability to stay alert over extended periods of
time.

Physical

1. Explosive Strength - ability to expend a maximum amount of energy
in one or a series of explosive muscular acts. The ability may be
involved in acts such as jumping or sprinting or in throwing objects
for a distance.

2. Stamina - ability involves the capacity to maintain physical activity
over prolonged periods of time.

3. Static Strength - ability to maintain a high level of muscular
exertion for some minimum period of time. This involves the degree of
muscular force exerted against a fairly immovable or heavy object in
order to lift, push or pull that object.

4. Gross Body Coordination - ability to use the trunk, arms and legs
together in movement.

5. Multilimb Coordination - ability to coordinate the movements of two
or more limbs (e.g., two legs, two hands, one leg and one hand). It
is most common to tasks where the body is at rest (e.g., seated or
standing) while two or more limbs are in motion.

6. Reaction Time - ability to react quickly to signals, unexpected
situations, or emergencies.

7. Manual Dexterity - ability to make skillful, coordinated movements
of a hand, or of a hand together with its arm. It may involve
manipulation of objects (e.g., blocks, pencils), but does not extend to
machine or equipment control (e.g., levers, dials).

8. Arm/Hand Positioning - ability to make precise, accurate movements
of the hands and arms.

9. Continuous Muscular Control - ability to exert continuous control
over external devices through continual use of body limbs.

10. Eye-Hand Coordination - ability to coordinate hand movements with
visual stimuli.

11. Rate of Arm Movement - ability to make gross, rapid arm movements.
means was computed across raters for each KSAO. This analysis was performed
for the total sample as well as for each race and sex subgroup.
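
The two-stage averaging described here (first a mean per officer across tasks, then a mean of those officer means) can be sketched as follows. The function name, data layout, and ratings are hypothetical and serve only to show the order of operations.

```python
from statistics import mean


def ksao_importance(ratings):
    """Mean of officer means for one KSAO.

    `ratings` maps officer -> {task: rating}. Each officer's ratings are
    averaged across tasks first, then those means are averaged across
    officers, following the two steps described in the text.
    """
    officer_means = [mean(by_task.values()) for by_task in ratings.values()]
    return mean(officer_means)


# Hypothetical ratings of a single KSAO (not data from the study).
ratings = {
    "officer_a": {"task_1": 4, "task_2": 2},
    "officer_b": {"task_1": 5, "task_2": 4, "task_3": 3},
}
print(ksao_importance(ratings))
```

Averaging within officers first gives every rater equal weight. In the study each officer rated the same 137 tasks, so this would match a pooled mean; the two only diverge, as in this toy example, when officers rate different numbers of tasks.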

Table 6 presents the means for each KSAO for the total sample and
for each of the four subgroups. Inspection of the table indicates that
Following Rules and Procedures, Judgment, Oral Communication and
Information Appraisal were seen as the most important cognitive KSAO's.
Within the Social-Personal domain, Pressure of Time, Repetitiveness,
Leadership and Teamwork were viewed as the highest in importance.
Looking at the means for the Perception and Physical KSAO's indicates a
downward shift in the mean ratings. This suggests that generally KSAO's
in these areas are of lesser importance in differentiating successful
from barely acceptable performance of the most important job tasks. In
the Perception domain, Near Visual Acuity, Far Visual Acuity and
Visualization obtained the highest mean ratings. For the Physical KSAO's,
Eye-Hand Coordination, Manual Dexterity and Arm/Hand Positioning were rated
the highest.

In order to determine if there was a high level of agreement in
mean ratings between the racial subgroups and sex subgroups, Pearson
Product Moment Correlations were computed across the mean KSAO ratings
for blacks and whites and for males and females. The correlation between
the black and white mean KSAO ratings was .88 and the correlation between
the male and female mean KSAO ratings was .72. These correlations would
suggest a high degree of agreement between the races and sexes in terms
of mean KSAO importance. Consequently, the development of an examination
plan is to be based on the KSAO means for the total sample.
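
For readers who want to see the agreement check concretely, the following is a minimal sketch of a Pearson product moment correlation between two subgroups' mean KSAO ratings. The function is a textbook implementation, not the authors' program, and the example uses only the ten Cognitive means for blacks and whites as transcribed from Table 6, so its result will differ from the .88 reported across all 43 KSAO's.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Cognitive KSAO means from Table 6 (ten of the 43 KSAO's only).
black = [3.09, 2.58, 2.54, 2.33, 3.03, 3.41, 2.24, 2.37, 2.46, 2.52]
white = [3.08, 2.77, 2.64, 2.34, 3.27, 3.55, 2.50, 2.64, 2.96, 2.73]
print(round(pearson_r(black, white), 2))
```

Note that a high correlation shows that the subgroups order the KSAO's similarly; it does not by itself show that their mean levels are equal.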


Conclusion 


Generally, it is felt that the job analysis model utilized in the
present study accomplished our primary goals of identifying those KSAO's
necessary for successful performance of the police job and linking them
to important job tasks. The procedure was also useful in that it allowed
for data collection on a large representative sample and at the same time
possessed sufficient face validity to encourage participation.

However, the ultimate test of the job analysis procedure will lie
in the validity of the instruments developed from this information about
the job. The next steps in the research include: 1) developing an
examination plan; 2) developing selection procedures; 3) developing
criterion instruments including possibly new supervisory ratings; and finally,
4) conducting a criterion-related validity study including cross-validation.
Table 6

Mean Ratings for the KSAO's for the Total Sample
and Race and Sex Subgroups

Skill, Knowledge, Ability,
Other Characteristics             Total   Black   White   Male    Female

Cognitive
Oral Communication                3.08    3.09    3.08    3.03    3.25
Deductive Reasoning               2.67    2.58    2.77    2.65    2.74
Inductive Reasoning               2.59    2.54    2.64    2.56    2.69
Written Communication             2.33    2.33    2.34    2.33    2.37
Judgment                          3.15    3.03    3.27    3.05    3.46
Following Rules and Procedures    3.48    3.41    3.55    3.40    3.73
Problem Sensitivity               2.36    2.24    2.50    2.30    2.55
Problem Solving                   2.50    2.37    2.64    2.44    2.67
Information Appraisal             2.70    2.46    2.96    2.62    2.95
Verbal Comprehension              2.63    2.52    2.73    2.62    2.63

Social-Personal
Pressure of Time                  3.24    3.31    3.13    3.43    2.83
Tolerance                         2.46    2.62    2.18    2.67    1.98
Working to Get Ahead              2.78    2.84    2.70    2.95    2.42
Leadership                        2.87    2.93    2.78    3.06    2.47
Cheerfulness                      2.24    2.27    2.19    2.45    1.79
Teamwork                          2.86    2.92    2.75    3.15    2.21
Dealing with Attack               2.21    2.35    1.98    2.27    1.86
Working Outside                   2.11    2.18    1.99    2.27    1.77
Repetitiveness                    2.86    3.04    2.55    3.12    2.27
Composure                         2.73    2.85    2.54    2.90    2.37
Flexibility                       2.10    2.16    2.01    2.34    1.58
Dealing with People               2.73    2.86    2.52    2.91    2.35

Perception
Visualization                     1.93    1.91    1.95    1.86    2.14
Depth Perception                  1.90    1.72    2.08    1.84    2.07
Near Visual Acuity                2.33    2.05    2.63    2.29    2.47
Far Visual Acuity                 1.98    1.87    2.09    1.95    2.06
Visual Form Perception            1.88    1.84    1.92    1.82    2.06
Closure                           1.83    1.82    1.85    1.80    1.92
Night Vision                      1.88    1.92    1.83    1.83    2.01
Size Perception                   1.81    1.82    1.80    1.72    2.11
Peripheral Vision                 1.85    1.86    1.85    1.79    1.05
Sensory Acuity                    1.87    1.99    1.75    1.72    2.34
Table 6 (cont.)

Mean Ratings for the KSAO's for the Total Sample
and Race and Sex Subgroups

Skill, Knowledge, Ability,
Other Characteristics             Total   Black   White   Male    Female

Physical
Explosive Strength                1.54    1.49    1.63    1.66    1.29
Stamina                           1.69    1.63    1.77    1.82    1.38
Static Strength                   1.51    1.44    1.63    1.64    1.25
Gross Body Coordination           1.72    1.68    1.77    1.87    1.37
Multilimb Coordination            1.95    1.94    1.97    2.03    1.79
Reaction Time                     1.92    1.94    1.89    2.00    1.75
Manual Dexterity                  2.01    2.04    1.94    2.11    1.77
Arm/Hand Positioning              2.00    2.03    1.95    2.09    1.79
Continuous Muscular Control       1.70    1.69    1.72    1.81    1.45
Eye-Hand Coordination             2.33    2.34    2.32    2.48    2.02
Rate of Arm Movement              1.73    1.73    1.73    1.86    1.43

r (Blacks and Whites) = .88
r (Males and Females) = .72
Federal and technical guidelines have made job analysis an
important requirement in the development of any personnel selection
test. Because of this requirement there has been a resurging interest
in the evaluation and comparison of various job analysis methodologies
for test development. Prominent among the methods being considered is
the Comprehensive Occupational Data Analysis Program (CODAP) approach.
This approach derives from the research of Dr. Raymond Christal and
his associates and was designed primarily to evaluate and develop
military training programs and to assist in job classification. CODAP
has seen limited use in the civilian sector, and, as far as we know,
has not yet been used outside the U.S. Civil Service Commission as a
primary job analysis method for test development. In the U.S. Civil
Service Commission, it has been used to assist in criterion development
and research participant selection in the validity research for
the Professional and Administrative Career Examination (PACE), and to
provide the primary source of task and duty data for entry level police
officers and firefighters. This paper generally discusses some of the
modifications of CODAP and some of the issues that derive from the use
of CODAP in a test development project. Focus will be primarily on the
firefighter job.

Despite its name, CODAP represents more than a sophisticated
system of computer programs for summarizing job data. It represents
the concept of job analysis that job incumbents, rating the relative
time spent on tasks they perform, can provide an objective and accurate
description of their job. This description can be a useful first
step in the development of an examination.

However, the job statement is only as precise as the tasks and
duties included in the inventory. Since the inventory developed for a
selection test usually focuses on only a single job and must provide
information conducive to the identification of the knowledges, skills,
abilities and other characteristics (KSAOs) required to successfully
acquire and perform the job, there are several important considerations.

First, the task statements must be specific enough to differentiate
different ability requirements. If tasks usually occurring together
require different abilities, it is desirable to write the tasks
as separate statements rather than one single compound statement. This
is particularly important when the job, such as firefighting, involves
a mixture of diverse abilities, ranging from the cognitive abilities to
the physical and sensory-perceptual skills to different personality
characteristics, such as interests, attitudes, and motivation.

In developing a task inventory for test development, researchers
must be careful that these various aspects of the job are adequately and
explicitly reflected among the tasks. Despite extensive reviews by
experts knowledgeable about the job, inventories can still contain
distortions. This is not to fault the reviewers, who can generally
be relied upon to perform the almost impossible task of determining
what tasks are missing from an inventory of several hundred tasks.
Rather, the problem is in the volatility of meaning of the English
language. An initial firefighting inventory consisting of task statements
such as "Maneuver ladders," "Operate hose lines," and "Carry
ventilating fans," may appear to be a suitable description of the
firefighter job. However, these statements may not sufficiently reflect the
fact that the tasks require KSAOs over and above mere physical
strength and dexterity. Reviewers and raters may be well aware that
maneuvering ladders and hose requires recall of numerous procedural
rules, reasoned interpretation of the specific emergency situation at
hand, and considerable judgment in making decisions about where and
how best to place the ladder, hose or ventilating fan. However, to
assure that these additional aspects of task statements are not overlooked,
they are best stated explicitly in separate task statements.
For example, "Maneuver ladders" might be supplemented with tasks such
as "Determine stability of supporting surfaces," "Understand and
follow spoken orders" and "Determine type and size of ladder required."

Since an entry level examination is generally concerned with
the abilities required to acquire tasks as well as the abilities required
to perform them, it may be similarly advantageous to explicitly
include tasks to reflect these requirements. Tasks concerned with
reading the manuals, studying the fire department literature, and
performing drills allow these aspects of the job to be more
explicitly expressed by the raters.

This specificity and focus of the task statements in a test
development effort has the immediate effect that most incumbents will
report that they perform a relatively large number of the tasks in
the inventory. When this occurs, it should come as no surprise that
even the most significant tasks may account for less than one percent
of the total ratings and that standardisation of raw ratings for each
individual rater becomes less important.
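The dilution effect described above can be illustrated with a minimal sketch. It assumes that standardisation means converting each rater's raw time-spent ratings to percentages of that rater's own total; the 1-to-7 rating scale, the figure of 300 tasks, and the function name are hypothetical and chosen only for illustration.

```python
def percent_time(raw_ratings):
    """Convert one rater's raw ratings to percentages of that rater's total."""
    total = sum(raw_ratings.values())
    if total == 0:
        return {task: 0.0 for task in raw_ratings}
    return {task: 100.0 * r / total for task, r in raw_ratings.items()}

# Hypothetical rater who reports performing 300 tasks on a 1-7 scale.
ratings = {f"task_{i}": 3 for i in range(300)}
ratings["task_0"] = 7  # the task on which this rater spends the most time

shares = percent_time(ratings)
top_share = max(shares.values())  # share of the highest-rated task
```

With several hundred tasks reported, even the highest-rated task in this sketch receives well under one percent of the rater's total, which is the pattern the paragraph above describes.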

The raters themselves are also not typical of respondents to a
CODAP inventory. In test development, the raters are the relatively
homogeneous group of individuals who perform the job for which the
test is being developed. This rater homogeneity, in conjunction with
the focus of the inventory, must not be overlooked in the interpretation
of the data. Particularly relevant to this is the OVERLAP
program.


OVERLAP clusters raters on the basis of the similarities of
their ratings. Generally, the differences between major clusters
will be of such a magnitude that a different job might be indicated.
However, in test development, the focus of the inventory and the
homogeneity of raters is such that clustering may well be on the basis
of differences in the response styles of the raters, rather than on
actual job differences.

This effect was best illustrated in a study of claims
authorisers. Because claims are randomly assigned to authorisers,
there must be assumed to be few differences in the claims processed
by different authorisers over a year's time. Nevertheless, OVERLAP
showed four distinct clusters. Two of these reported performing a very
large number of tasks, while the other two reported many fewer tasks.

Interacting with this effect, two groups reported performing
the more unusual aspects of claims work, while the other two groups
restricted their ratings to the more conventional claims authoriser
tasks. Additional follow-up confirmed that there were no job differences
and that OVERLAP had in effect clustered raters on their different
tendencies to report tasks and on their different tendencies
to report unusual tasks.

CODAP provides two estimates of a task's relative significance.
One averages the ratings only over those who perform the task; the
other averages over all raters. The second method is the more useful
for test development. Although some extremely critical tasks may not
be considered when this approach is used, it must be recalled that a
selection test must be designed to measure the generally more significant
aspects of the job. It cannot be designed to select on abilities
that will be required by only a few workers performing a critical task.
The value of a test will be much greater when it selects on abilities
required to perform the more moderately critical tasks performed by
nearly all workers in the occupation.
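The difference between the two estimates can be sketched as follows. This is an illustrative sketch only, not CODAP code: it assumes that a rating of zero means the rater does not perform the task, and the function name, task names and ratings are hypothetical.

```python
def task_means(ratings_by_rater, task):
    """Return (mean over raters performing the task, mean over all raters)."""
    values = [rater.get(task, 0.0) for rater in ratings_by_rater]
    performers = [v for v in values if v > 0]  # zero = task not performed
    mean_performers = sum(performers) / len(performers) if performers else 0.0
    mean_all = sum(values) / len(values) if values else 0.0
    return mean_performers, mean_all

# Hypothetical ratings from four incumbents.
raters = [
    {"operate hose lines": 5, "rescue trapped person": 0},
    {"operate hose lines": 4, "rescue trapped person": 0},
    {"operate hose lines": 6, "rescue trapped person": 7},
    {"operate hose lines": 5, "rescue trapped person": 0},
]

common = task_means(raters, "operate hose lines")
rare = task_means(raters, "rescue trapped person")
```

In this sketch the rarely performed task scores 7.0 when averaged over its single performer but only 1.75 when averaged over all four raters, while the commonly performed task scores 5.0 either way. Averaging over all raters therefore favours tasks performed by nearly everyone.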

Although the relative time spent scale has been demonstrated
to be a generally useful indicator of a task's importance, it has
certain limitations when a job such as firefighting is considered.
For firefighters, the most important or critical tasks are often not
the ones involving the most time, and neither of these scales
necessarily identifies the tasks that best differentiate between the
superior and barely acceptable firefighter. For this reason, the
time spent scale was supplemented with a criticality (importance)
scale and a difficulty scale. All raters completed all three scales
in a counterbalanced design. Time to complete the inventory was two
to three hours, and although most raters (69%) indicated that the
inventory was too long, many acknowledged the necessity for its
length and commented on its comprehensiveness.

Fatigue or changing levels of motivation seemed to have a
minimal effect on the ratings, since there were no significant (p<.05)
differences among the six sequences in which the 3 scales were administered.
Only the difficulty scale showed a significant (p<.05)
decrease in ratings as the order of administering the scale increased
from first to second to third, but this effect accounted for only 4.$
percent of the total variance.
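The six sequences follow from the three scales, since three scales can be ordered in 3! = 6 ways. A minimal sketch of such a counterbalanced assignment is given below; the paper does not say how raters were allocated to sequences, so the simple rotation used here is an assumption, as are the function and variable names.

```python
from itertools import permutations

SCALES = ("time spent", "criticality", "difficulty")

# All possible administration orders of the three scales.
sequences = list(permutations(SCALES))

def assign_sequences(rater_ids):
    """Rotate raters through the sequences so each order is used equally often."""
    return {rid: sequences[i % len(sequences)] for i, rid in enumerate(rater_ids)}

# Hypothetical group of twelve raters.
assignment = assign_sequences(range(12))
first_scales = [seq[0] for seq in assignment.values()]
```

With a rater count that is a multiple of six, each scale appears equally often in first, second and third position, which is what allows order effects such as fatigue to be tested separately from differences between scales.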

The three scales used contributed uniquely to the job analysis.
When the mean ratings (for members performing) on each scale were
calculated, criticality correlated only .40 with time spent and .50
with difficulty, while time spent correlated .10 with difficulty.
Despite these differences, when only the top quarter of the ranked
tasks on each scale were selected using this procedure, 67 were in the
top quarter on all three scales and another 34 on two scales. The
unique contribution of each scale consisted of four tasks for
criticality, five tasks for time spent, and ten tasks for difficulty.
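The selection step described above can be sketched as taking the top quarter of tasks on each scale and then counting on how many scales each selected task appears. The sketch below uses eight hypothetical tasks with made-up mean ratings; how the study handled ties is not stated, so the plain sort used here is an assumption.

```python
def top_quarter(mean_ratings):
    """Return the set of tasks ranked in the top quarter on one scale."""
    n_keep = len(mean_ratings) // 4
    ranked = sorted(mean_ratings, key=mean_ratings.get, reverse=True)
    return set(ranked[:n_keep])

def overlap_summary(scales):
    """Classify selected tasks by the number of scales on which they rank high."""
    tops = {name: top_quarter(means) for name, means in scales.items()}
    counts = {}
    for task in set().union(*tops.values()):
        counts[task] = sum(task in top for top in tops.values())
    return {
        "all_three": sorted(t for t, c in counts.items() if c == 3),
        "two": sorted(t for t, c in counts.items() if c == 2),
        "unique": {name: sorted(t for t in top if counts[t] == 1)
                   for name, top in tops.items()},
    }

# Hypothetical mean ratings for eight tasks (top quarter = two tasks per scale).
scales = {
    "time spent":  {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2, "F": 1, "G": 1.5, "H": 0.5},
    "criticality": {"A": 7, "B": 3, "C": 6, "D": 2, "E": 1, "F": 4, "G": 5, "H": 1.5},
    "difficulty":  {"A": 5, "B": 6, "C": 1, "D": 2, "E": 3, "F": 4, "G": 2.5, "H": 1.5},
}
summary = overlap_summary(scales)
```

The final task list is the union of the three top quarters, so its length equals the number of tasks common to all three scales, plus those on two scales, plus each scale's unique tasks.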

To this list of 120 tasks were added 39 tasks that six supervisors
(officers) rated in the top quarter on criticality for the
entry level firefighter job. This supervisory perspective of the
entry level job contributed mostly rescue and first aid type tasks,
as opposed to the emphasis on personal safety and skill acquisition
type tasks of the entry level firefighter.

The most difficult and critical phase in the use of the CODAP
job description is the link from the task statements to the KSAOs
required to acquire and perform the entry level job. Ten representatives
from the fire department and four psychologists from the U. S.
Civil Service Commission rated the importance of each of 57 KSAOs for
acquiring and performing the 159 most significant entry level tasks.
To simplify the ratings, raters were asked to indicate if a KSAO was
important for predicting performance on the job as a whole, on the
duties, and on the individual tasks; and if so, to indicate if the
KSAO would be minimally qualifying as a screen-out or if it could be
used to differentiate (rank) applicants with different amounts or
levels of the KSAO.

Tables 1, 2, and 3 show how psychologists and incumbents
ranked the KSAOs in the three categories of abilities used in the
study. The similarity between the psychologists' and incumbents'
rank ordering of KSAOs is considerable. (Ranks were based on the
frequency with which the ability was considered to be a ranking
factor.) However, there are some interesting differences. Incumbents
considered "Following orders" to be a more important factor than did the
psychologists, who in turn placed greater emphasis on "Long term
memory." Having relatively limited fire scene experience, psychologists
may not ordinarily become aware of the extent to which following
orders differentiates levels of firefighting performance, while
incumbents' lower ratings of long term memory may reflect their
reluctance to dwell on something so basic, all pervasive, and obvious.

The two types of raters showed almost complete agreement in
ratings of the perceptual and physical KSAOs (Table 2), but manifested
some large differences in their ratings of the miscellaneous KSAOs
(Table 3). Of the differences in the overall job ratings of miscellaneous
abilities, for example, incumbents considered "Willingness to
work shifts," "Interest in variety," "Ability to work in confined
spaces," and "Ability to drive," relatively more important than did the
psychologists, who in turn rated "Willingness to work in unpleasant
situations" and "Willingness to follow instructions" as more important.
Similar differences occur among the duty ratings, although not
always in the same direction. Abilities rated highly important for
the duties may receive only low ratings on the job overall.


The larger number of differences between raters on these KSAOs
may be attributed in part to the less clearly defined nature of these
KSAOs and the resultantly lower reliabilities of their ratings. These
effects are additionally amplified by the low n (n = 4) for the
psychologists.

Table 4 shows that psychologists and incumbents agree considerably
in rating the importance of the KSAOs by tasks; however, there are
some differences, and psychologists do not always link KSAOs to the
same tasks as incumbents.


First, psychologists rated basic mechanical ability high, but
incumbents rated it low. On the other hand, incumbents rated "Ability
to use simple formulas" high as a ranking factor, where psychologists
focused more on basic math as the ranking factor.


Table 5 shows the rater similarities and differences between the
tasks linked to the more important cognitive abilities. Included among
the larger differences are ratings on "Long term memory." Psychologists
rated it an important ranking factor for 29 tasks as opposed to 8
tasks for the incumbents. Incumbents focused on firefighting tasks,
psychologists on first aid, special emergencies, and training. For
ability 8, "Quick recall," psychologists similarly emphasised the
first aid type tasks. Psychologists considered abilities 9 and 10,
basic math, proportion and percents, important ranking factors for
the few firefighting tasks that involve calculations, while incumbents
considered "Ability to use formulas" as the critical ranking factor
for these tasks.

As indicated earlier, psychologists rated mechanical ability to
be important for more tasks than did the incumbents. "Ability to identify
problems" was rated highest by both types of raters, although
psychologists rated it more important for more tasks, mostly emergency
first aid tasks but also some maintenance tasks. Induction (ability 16)
was similarly emphasised by psychologists, although for the first time
it included a substantial number of basic firefighting tasks (Duty C).
The same is true for deduction (ability 17) but on a much smaller
scale.

Judgment (ability 19) was rated high by both groups but on the
basis of almost entirely different sets of tasks. Incumbents uniquely
linked judgment to tasks involving "recognising," "determining," and
"locating." Psychologists linked it to more specific tasks involving
removing or leading persons, safeguarding property and eliminating
unsafe conditions.

It is possible to theorise ad infinitum about these differences,
but such theorising should be tempered with the consideration that the
psychologist ratings are derived from only four persons and that the
criteria for selecting linked tasks to be counted required agreement
among 3 of the 4 psychologists as opposed to 5 out of the 10 incumbents.
Nevertheless, it is important to be aware of some possible differences
between psychologists' and incumbents' ratings, even though their
ratings will generally agree.
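The effect of the two agreement criteria can be sketched as a simple vote count per task. The sketch assumes each rater gives a yes or no judgment on whether a KSAO is a ranking factor for a task; the function name, task names and votes are hypothetical.

```python
def linked_tasks(votes_by_task, min_agree):
    """Return the tasks that at least min_agree raters linked to the KSAO."""
    return sorted(task for task, votes in votes_by_task.items()
                  if sum(votes) >= min_agree)

# Hypothetical votes: True means the rater linked the KSAO to the task.
psychologists = {
    "splint a fracture": [True, True, True, False],   # 3 of 4 agree
    "advance hose line": [True, True, False, False],  # 2 of 4 agree
}
incumbents = {
    "splint a fracture": [True] * 4 + [False] * 6,    # 4 of 10 agree
    "advance hose line": [True] * 5 + [False] * 5,    # 5 of 10 agree
}

psych_links = linked_tasks(psychologists, min_agree=3)
incumbent_links = linked_tasks(incumbents, min_agree=5)
```

Because 3 of 4 is a 75 percent criterion while 5 of 10 is a 50 percent criterion, the same level of agreement (half the raters) counts as a link for incumbents but not for psychologists, which is one reason the task counts of the two groups are not strictly comparable.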

For the entry level firefighter project, only the incumbents'
ratings were used. Cognitive abilities selected for testing were
identified from those that were rated to differentiate performance on
a task-by-task basis and on a duty basis.

Both ratings are important. The task ratings link the abilities
to their relevant tasks and thereby help to reveal their meaning.
Examination of the tasks that require each ability can also help to
identify the level at which the ability might be required. The duty
ratings allow for an expression of the importance of such abilities
as "endurance" which would not be manifested in task ratings alone.

This paper has focused primarily on the identification of
ranking factors, especially the cognitive ranking factors. The
development of the entry level examination will involve a parallel
analysis of those factors that can more properly be used as a screen-out
in a total examination process.
IMPLEMENTATION OF THE CURRENT TASK INVENTORY BANK (CTIB) PROGRAM

by 


Gerald  R.  Clow 
Occupational  Survey  Branch 
USAF  Occupational  Measurement  Center 
Lackland  AFB  TX 


Introduction: The goal of the CTIB program is to achieve and maintain a
current job inventory on every enlisted occupation in the Air Force.
This includes, by definition, every career ladder, shred, and special
duty identifier (SDI) in existence. The short term goal is to achieve a
state of currency on all existing job inventories by April 1978. At
last count this was a total of two hundred forty-seven (247) ladders/shreds/
SDIs. After April 1978 the goal will be to systematically review existing
job inventories and to develop inventories for those occupations that
have not been surveyed.

Inventory Development History: Traditionally, it has taken 47 weeks to
complete an occupational survey of a single career ladder, barring any
unforeseen circumstances. Until eighteen (18) months ago, eighteen (18)
weeks of that time was consumed by the inventory development process.
Last year development time was reduced to thirteen (13) weeks. Table 1
contains weekly phase points of the inventory development process which
existed from 1967 until 1976. Table 2 shows a reduced inventory development
schedule implemented in 1976. As you can see in Table 1, six (6) weeks
of development time was consumed by the reproducing and mailing of field
reviews.

It was decided in 1976 that the only way to reduce occupational
survey time would be to discontinue, on a routine basis, the administration
of field reviews. A few specialists and I had long questioned the value
of the field review as a significant aid to the development process. In
my opinion, field reviews did not provide any significant changes to an
inventory which was developed by the interview procedure. The usefulness
of the field review was not resolved at this time. However, since most
of the inventory development effort was now involved with occupations
which had been surveyed, field reviews were judged as not needed.
Write-ins from booklets were being kept and were made available to
developers. Field reviews would only be performed on a selective basis.

Consequently, field reviews were discontinued as a routine procedure
and inventory development time was reduced by five weeks. (One week was
added to the six saved because of scanner formatting.) The inventory
development process time was thereby established as thirteen (13) weeks.

Thirteen (13) weeks was still unsatisfactory. Projects were being
accomplished to meet a goal of performing occupational surveys of fifty-one
(51) career ladders a year, and performing a resurvey on each career
ladder every four (4) years. By 1975, changes in the classification
structure were numerous. User priorities were increasing. It was
impossible to schedule fifty-one (51) career ladders a year and be
responsive to user demands. Out of this dilemma, the CTIB concept was
created.

CTIB Background: In October of 1976, Captain Tom Ulrich and Chief
Jim Moon were assigned the responsibility of establishing operational
criteria for implementing CTIB. They provided three (3) criteria for
evaluating job inventories as stated in Tables 3, 4, and 5. These still
serve as a guide for making CTIB judgments.

I initially assigned functional areas, containing a number of
occupations, to inventory developers for accomplishment of an initial
CTIB review of existing job inventories. This review was to ascertain
currency, quality, and priority of inventories. This method did not
work. The experience level of some developers was not sufficient to
make judgments about quality. It was also counter-productive for developers
to shift their attention between CTIB and development work. Managerial
control was impossible because of inventory development travel. Therefore,
I decided that a separate type of position needed to be created, one
that would be responsible for managing a functional area of Air Force
occupations for purposes of CTIB.

CTIB Implementation: Three (3) CTIB manager positions were created
early this year. Their function is to achieve and maintain a knowledge
of a number of occupations. They are responsible for assessing the
quality and currency of existing job inventories in their assignment.
They must also ascertain user priorities for occupational surveys within
their assignment.

CTIB managers accomplish their function by maintaining telephone
contacts with training, classification, functional and operational
personnel. They also review classification and training documents.
However, the most important information reaches the CTIB manager through
the process of an "interview review" by an inventory developer. This
consists of an interview with subject-matter specialists at either a
Technical Training Center or a field location whereby the entire job
inventory is reviewed in detail. A determination is made after this
review as to the quality and currency of a job inventory. Two or three
inventory developers are assigned to each CTIB manager, thereby constituting
a team. I believe this will, in time, provide a continuity of knowledge
about a grouping of occupations that is essential to CTIB.
In summary, three CTIB managers and their teams are reviewing
existing job inventories for quality and currency. This review, and
necessary updating of job inventories, should be accomplished by April 1978.
At that time, criteria for maintenance of CTIB will be established.
TABLE 1

EIGHTEEN (18) WEEK INVENTORY DEVELOPMENT CYCLE


WEEK   ACTION

  1    Research
  2    1st TDY (Technical Training Center)
  3    Revise Draft
  4    Type Draft
  5    Field TDY
  6    Field TDY
  7    Revise Draft
  8    Type Field Review
  9    Reproduce Field Review
 10    Reproduce Field Review
 11    Mail Field Review
 12    Mail Field Review
 13    Mail Field Review
 14    Mail Field Review
 15    Finalize Job Inventory
 16    Finalize Job Inventory
 17    Type Job Inventory
 18    Send to Printer


TABLE 2

THIRTEEN (13) WEEK INVENTORY DEVELOPMENT CYCLE


WEEK   ACTION

  1    Research
  2    1st TDY (Technical Training Center)
  3    Revise Draft
  4    Type Draft
  5    Field TDY
  6    Field TDY
  7    Finalize Job Inventory
  8    Finalize Job Inventory
  9    Type Job Inventory
 10    Type Job Inventory
 11    Editorial Review
 12    Editorial Review
 13    Send to Printer
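The change from the eighteen-week to the thirteen-week cycle can be checked by comparing the two schedules week by week. The sketch below is only a bookkeeping aid: the action names are taken from Tables 1 and 2 (with "1st TDY" abbreviated), and the variable names are hypothetical.

```python
from collections import Counter

# Weekly actions from Table 1 (eighteen-week cycle).
CYCLE_18 = (
    ["Research", "1st TDY", "Revise Draft", "Type Draft"]
    + ["Field TDY"] * 2
    + ["Revise Draft", "Type Field Review"]
    + ["Reproduce Field Review"] * 2
    + ["Mail Field Review"] * 4
    + ["Finalize Job Inventory"] * 2
    + ["Type Job Inventory", "Send to Printer"]
)

# Weekly actions from Table 2 (thirteen-week cycle).
CYCLE_13 = (
    ["Research", "1st TDY", "Revise Draft", "Type Draft"]
    + ["Field TDY"] * 2
    + ["Finalize Job Inventory"] * 2
    + ["Type Job Inventory"] * 2
    + ["Editorial Review"] * 2
    + ["Send to Printer"]
)

# Multiset differences: weeks dropped from, and weeks added to, the old cycle.
removed = Counter(CYCLE_18) - Counter(CYCLE_13)
added = Counter(CYCLE_13) - Counter(CYCLE_18)
net_weeks_saved = sum(removed.values()) - sum(added.values())
```

Read this way, the tables show eight weeks of field review activity dropped (including six weeks of reproducing and mailing) and three weeks added for typing and editorial review, a net saving of five weeks. The running text arrives at the same net figure but describes it as six weeks saved and one added.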
TABLE 3

CRITERIA FOR PRIORITIZING ASSIGNED CAREER LADDERS


WHEN TO REVISE JOB INVENTORY

No existing OSR

Previous inventory incomplete

Large number of people in career field

Previous inventory did not differentiate between levels or shreds

Recent major revision to CL (AFSC, shreds etc.)

Tech school, ATC (TT or SG), AFMPC/DPMPRQ or USAF strongly urges revision

When inventory is not adequate or current

When equipment is added or deleted from the field

When a process changes (e.g. missile set up on site vs at factory)

When a change in technology adds or deletes tasks

When changes to tasks require reorganization of the inventory

When a large number of tasks are also being performed by another career
field or ladder

Analysis indicates need for change

No revision for more than _ years

Multiple ladders in one inventory


WHEN NOT TO REVISE JOB INVENTORY

No changes to equipment

Previous inventory/OSR complete and comprehensive

Recently revised

Small number of people in career field

No classification changes to career field or ladder

Within 6 months of the administration of the inventory

When the field is a direct equivalent for a civilian occupation (such as
522X4, Protective Coating Specialist = Painter) and no change (or only
minor changes) has occurred in the methods, materials, or processes in
the field since the last inventory

When information sources (ATC, HQ, MPL) indicate no changes to field
anticipated

When no new tasks have been created for old equipment

When no new technology has been added to the career field

When there is no overlap between career field and another career field
in the task list


TABLE 4

CRITERIA FOR ESTABLISHING WHETHER OR NOT A JOB INVENTORY
IS ADEQUATE AND CURRENT


ITEMS, EVENTS OR DOCUMENTATION WHICH INDICATE A JOB INVENTORY IS
ADEQUATE OR CURRENT

All items in 39-1, STS, POI are included in inventory

No new equipment or aircraft since last inventory

No equipment or aircraft deleted since last inventory

Data from last inventory indicates that tasks were broken down to
appropriate level

Structure of CL and assignment locations have remained constant

No request for update from the field

Field evaluation by tech school which indicates training is in synch
with job requirements

A conference of major command career monitors in which no training
problems or classification problems are noted

All new equipment tasks are included

All new technology tasks are included

All overlapping tasks are clearly re-stated

No tasks are included in inventory that are really done by another AFSC

Unique tasks performed by that AFSC are included in the inventory

Task list is well organized and logical

Manpower survey tasks are all represented

SMS review surfaces no additions or deletions

All data sources agree inventory is adequate (AFMPC, ATC, etc.)

No significant write-ins during administration

Data indicates all tasks performed


ITEMS, EVENTS OR DOCUMENTATION WHICH INDICATE A JOB INVENTORY IS
NOT ADEQUATE OR CURRENT

Items in 39-1, STS, POI are not included

New equipment or aircraft

Classification or training change since last OSR

Data from OSR indicates that tasks were not specific enough, data
wouldn't provide enough info to make management decisions

Small group survey determines need for revision

Proposed career field structure change letters from MPC

Field evaluation at tech school indicates training is out of synch with
job requirements

A conference of major command career monitors in which problems
concerning training or classification are surfaced

New equipment or technology tasks are not included

Tasks overlap

Another AFSC's tasks are included

None of the unique tasks performed by that AFSC (if any) are included

Task list is disorganized and illogical

Previous survey shows over 10% tasks performed by at least 60% of CL

If none exists and other indicators point to a need to develop one

Multiple ladders in one task list

Large number of write-ins during administration

Data indicates many tasks not performed
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TABLE  5 


CRITERIA  FOR  ESTABLISHING  WHETHER  OR  NOT  A  CAREER  LADDER  HAS  CHANGED 
ENOUGH  TO  WARRANT  REVISING  THE  JOB  INVENTORY 


EVENTS  INDICAT ING  A  REVISION  IS  WARRANTED 

New  or  revised  STS,  39-1,  add  or  delete 
duties  or  equipment 

Population  of  CL  has  significantly 
increased  or  decreased 

Assignnment  locations,  major  using 
commands  have  changed 

Request  from  Tech  School,  ATC,  and/or 
HQAF 

Tasks  are  being  added  to  or  deleted 
from  career  field  as  a  result  of  new 
equipment  technology 

Unstable  career  field,  has  not  been 
surveyed  for  four  years 

Inventory  is  disorganized  because 
changes  in  procedures  occurred  or  new 
tasks  have  been  added 

New  safety  standards  (e.g.  2  men 
performing  a  job  formerly  performed 
by  a  single  person). 

Change  in  levels  of  performance  on  STS 

Change  in  number  of  hours  in  blocks  or 
topics  on  course  chart 

A  new  AFSC  structure  is  a  results  of  adding 
or  deleting  shreouuts,  weapons  systems 
or  equipment 

Radical  changes  In  methodology  of 
performance 

CDC  rewritten 


EVENTS  INDICATING  A  REVISION  IS  NOT 
_ WARRANTED _ 

Current  STS  and  39-1  are  adequately 
covered  in  previous  inventory 

Population  remains  stable 
quantitatively  and  qualitatively 

Assignment  locations  and  major 
using  commands  unchanged 

No  classification  changes  since 
last  survey 

No  changes  in  course  documents  for 
a  long  time 


Stability  in  terms  of  equipment, 
manning  and  technology 

Same  weapons  systems  remain  in 
field  to  be  worked  on  as  on  last 
survey 

Survey  recently  completed 


CDC  nas  remained  stable 

STS  changes  in  proficiency  levels 
only 


Change  in  AFSC  structure  which  creates  a 
new  AFSC 

Conversion of many authorizations from
military  to  civilian  or  vice  versa 


An Innovation in Identifying
Air Force Qualitative Training Requirements

HENDRICK  W.  RUCK 
MICHAEL  W.  BIRDLEBOUGH 

Under  the  present  Air  Force  classification,  training  and 
assignment  policies,  first  term  airmen  are  normally  trained  to 
be  universally  assignable  within  their  specialty.  This  need 
for  universally  assignable  personnel  places  a  requirement  on 
the technical training system to provide broad-based training
across  the  majority  of  jobs  which  may  be  performed  by  course 
graduates.  Job  specific  training  is  then  provided  by  the 
gaining unit through locally developed OJT programs. This
approach  to  meeting  training  requirements  may  result  in  higher 
costs  than  training  which  is  oriented  toward  a  narrower  range 
of  particular  jobs. 

Pressure to reduce resources -- instructors, students,
and support -- tied up in training has led the Air Force to
reevaluate the necessity for universally assignable airmen.
With the prospect of increasing assignment stability, the
notion that airmen in their first enlistment should be trained
for assignment to any job in their specialty has been
questioned. That is, narrowing the training target for first
assignments by training to specific jobs or families of jobs
is being examined and tested as a viable alternative to current training
and assignment patterns. Such targeting requires a clear picture of
current utilization patterns. (The utilization pattern is the number
and nature of different jobs within the specialty as well as the relative
importance of and relationships among those jobs. Its key dimensions
include numbers of personnel assigned, geographic/command distribution,
progression, etc.) Occupational survey data are a prime source of
information about existing utilization patterns.

A clearly defined outline of the utilization pattern yields payoffs
for managers and trainers alike. It allows for the review of jobs
performed in the Air Force as a whole as well as within specific
commands. Of particular interest are the jobs to which first termers
are usually assigned. Such information can be very useful to personnel
who manage various specialties, and permits development of well-informed
plans for future utilization. Once future utilization has been planned,
the training required to support the projected utilization pattern may
be developed.

The concept of utilization oriented training requires that training
be offered at appropriate career points and using various training modes
(such as resident courses, on-the-job training (OJT), and field training)
to insure optimum pay-off in training actually applied to mission requirements.
Within each training mode, the full range of available
instructional strategies can be employed. The focus of
utilization oriented training development within the Air Force has
been on first enlistment airmen. However, changes in first
enlistment utilization and training require adjustment of
utilization and training patterns for the career force as
well.

This paper has been quite general to this point; it may
now be useful to discuss specific mechanisms which may be used
for developing utilization oriented training. Several steps
are required in the development of utilization oriented
training strategies.

First, the present utilization patterns must be described
and understood by decision makers. One proven method for
accomplishing this first step is to assemble a small working
group consisting of technicians and managers who are
responsible for effectively utilizing personnel in the specialty.
This group intensively reviews and synthesizes available job
data to gain an understanding of present utilization.
Although other sources may be used, occupational survey data
have effectively served as the baseline for informed decision
making. The working group considers the ramifications of that
pattern and develops alternative plans for utilization of
personnel in the specialty based upon their findings.
Alternatives frequently suggested include subdivision of the 3-level
into various channels by weapon system, workcenter, or command
of assignment. The results of this working group meeting
should then be fully staffed through the personnel, training,
and operational agencies to solicit comments as well as
considerations not addressed earlier.

Next, a policy-making group comprised of Air Staff,
operational, personnel, and training managers as well as technical
experts and members of the original working group should meet
to finalize the utilization plan. This group has the benefit
of the original working group minutes as well as comments from
the staffing process. It establishes the utilization plan and
develops an action plan for its implementation. Changes to
existing personnel and training programs are identified.

The results of this meeting are staffed, preliminary cost
estimates are developed, and implementing actions are initiated.
The actual implementation is an iterative process, with
adjustments to the milestone schedule being made as issues
arise. Prior to final approval, refined cost estimates are
developed which incorporate modifications to the initial plan.

The steps outlined above obviously result in more than
just the determination of training requirements. In fact,
differences between this procedure and the existing system
can bring about several significant opportunities. For
example, the new procedure allows operational managers to
redesign jobs or gain insight into present jobs, which may allow
for less demand on the training system while not seriously
affecting accomplishment of the operational mission. The
procedure also allows operational managers to review
scientifically derived job data for the whole Air Force as well as
their own commands -- a capability not normally exercised in
the Air Force. Although training requirements are devised
based on operational requirements, the procedure allows for
operational changes which may provide for more cost effective
training.

The procedure described in this paper is not official Air
Force policy. It is experimental. It has been applied in
three diverse specialties and may be applied to other
specialties. Not all specialties are appropriate for such intense
consideration. Those specialties which have been analyzed
using this procedure have been high training cost specialties.
It appears that beneficial changes will occur in training for
all three specialties, although final outcomes are difficult
to predict. Finally, the procedure is not set in concrete.
It has evolved over the course of 18 months and will continue
to be adaptable to unique situations.


Small Sample Studies to Assess
Career Field Changes


Capt  David  S.  Street 
Capt Douglas Gorman


ABSTRACT

The Air Force Occupational Survey program has for some time been
collecting stratified random survey samples which in many cases approach
the actual number of cases in a career field population. The current
study is planned to assess the relative benefits of administering
smaller samples of a career field survey as a first step in
developing more quantitative criteria to determine when a career field
structure and jobs within a career field might be changing. Since
the data analysis is not yet completed, this paper is presented in
the form of a description of some of the problems initially encountered
as well as some of the issues considered in examining smaller sample
surveys for cluster analysis.


Introduction

The Air Force has for several years been collecting occupational data
based on survey procedures where a large percentage of the career field
population has been sampled. In order to maintain current data on career
fields and survey all career fields comprehensively, the USAF Occupational
Measurement Center has been surveying approximately fifty-one career fields
per year, which allows for about one survey per career field every four
years. Most jobs in career fields and most career field structures are
found to be relatively stable with only minor changes over a four year
period. In most cases it seems that the data provided in Occupational
Survey Reports (OSR) have proven not to be perishable for at least 3 years
and sometimes for much longer periods where a career field remains
relatively unchanged. Although this may be a good overall procedure for
updating data, it might be possible to more selectively pinpoint needs for
data update through the development of decision criteria on data base
obsolescence. The use of the small sample study is one step in developing
some idea of the feasibility of such a technique.


Problem: Change in Career Field Data

Career fields can change in two primary ways: (1) by adding or deleting
tasks performed or (2) by changing the amount of time different people
expend in given tasks. Both of these may constitute changes in career
field structure in relation to Air Force management procedures. With the
advent of the Current Task Inventory Bank (CTIB) procedure at the USAF
Occupational Measurement Center it has become possible to update task inventories
systematically on a real-time basis. What this means in terms of career
field change is that tasks may now be added to the survey instruments or
identified as possibly no longer performed. This gives the potential of
having any inventory current for administration at any point in time.

Once "significant" changes have occurred in the tasks or in who performs
tasks in a career field or between career fields, then the OSR data might
require updating. If the change is reasonably minor, it might be possible
to "correct" the current data on the basis of a more limited sampling
technique. If such a procedure could be developed, then it might be possible
that a system more sensitive to career field change could be adopted to
update occupational data in the Air Force.


Method of Approach

In examining the potential of a more limited sampling size for updating
survey data one might initially suspect that a statistical solution could
be calculated to determine what would be "scientifically" acceptable.
However, when one examines the clustering analysis technique¹ which is so
useful in this type of job analysis, then it becomes clear that although a
statistical solution might be developed, it will probably be complex.

This technique might be thought of graphically as taking in more and more
bits of information in a photograph or a jigsaw puzzle: the more information
you have, the more recognizable the picture becomes. The computer with
the Comprehensive Occupational Data Analysis Programs (CODAP) will cluster
the job incumbents together, but it is up to the occupational analyst to
be able to see the picture and interpret the "picture" and meaning to the
manager who will use the data. Therefore, although some criteria might
be arranged for the decision of meaningful job groups, the analyst is
placed in the position of re-evaluating the criteria as they might be
applied to any given career field, set of career fields, or possible data
utilization.

In approaching the problem of determining when a career field might
be "significantly" changing, two elements of interest would be the tasks
added or deleted and what may be called the structure of the career field,
which relates to the relative amount of time spent by job incumbents
performing any given task. Task additions or deletions might alter a task
inventory to an extent, but in terms of perishability of the data the real
concern is how the career field is organized into job type or cluster groups
and what tasks these groups may be said to perform most often. Also, a
career field might change significantly through a management decision where
the same tasks were redistributed in terms of groups performing a set of
tasks. Obviously, any one of these kinds of changes might have an impact
on training, classification, or assignment practices and is vital
information to effective management. A measure of change in career field "structure"
would be of primary interest. Career field structure or job structure
might be defined as the number and relationships of jobs in a given
career field and the number of job incumbents related to any given
grouping.

The Air Force Human Resources Laboratory's CODAP system prints out an
occupational structure in the form of a cluster merger diagram along with
a series of computer arrayed data products which allow an occupational
analyst to interpret the data received in an Air Force Occupational
Survey. Normally, these programs are used to compare job types within a
career field or between career fields when two or more career fields have been
surveyed in the same survey instrument. However, the CODAP system is a
highly flexible series of programs and the small sampling project might use
it to advantage in systematically comparing different job structures across
several different samples.
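
As an illustration of how two time-spent profiles might be compared, the following Python sketch converts relative time ratings to percentages and sums the smaller percentage on each task. The overlap measure, the function names, and the task labels are assumptions made for illustration; the paper does not state the formula CODAP itself uses.

```python
def percent_time(raw_ratings):
    """Convert relative time-spent ratings per task to percentages (sum = 100).

    raw_ratings: dict mapping task label -> non-negative relative rating.
    The rating scale is hypothetical; only the proportions matter here.
    """
    total = sum(raw_ratings.values())
    if total == 0:
        raise ValueError("no time reported on any task")
    return {task: 100.0 * rating / total for task, rating in raw_ratings.items()}


def time_overlap(profile_a, profile_b):
    """Sum, over all tasks, of the smaller of the two percent-time values.

    100 means identical time distributions; 0 means no task in common.
    This is one plausible similarity measure, assumed for illustration.
    """
    tasks = set(profile_a) | set(profile_b)
    return sum(min(profile_a.get(t, 0.0), profile_b.get(t, 0.0)) for t in tasks)


if __name__ == "__main__":
    # Two hypothetical job-group profiles built from made-up ratings.
    group_1 = percent_time({"align radar": 3, "replace parts": 1})
    group_2 = percent_time({"replace parts": 2, "supervise": 2})
    print(time_overlap(group_1, group_2))
```

A measure of this kind gives a single number per pair of job groups, which is one way the same group could be compared across a full survey and a smaller sample.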

The question of "significant" change in relation to job structure must
always be considered within the context of the relative management
policies and concerns of the job environment. Many items may be considered
in terms of significance such as worker satisfaction, mission effectiveness,
training costs, classification limitations, assignment system adaptability,
recruiting policies, and a host of other possibilities. Here significant
change is tied back into a more socially defined term that may relate to
the purpose that the user might make of the data. It seems safe to
assume that no simple definition of significance will suffice in this type
of data feedback environment. A change in any portion of the behavioral
"picture" of a career field might or might not affect the management
perception of the whole, and the viewpoint of "significance" might be judged
in relation to the aspects individually as well as synergistically.

Focusing on a measure of significant change in career field structure
would seem, at least initially, to require a significant amount of
qualitative comparison in order to tease out as many relevant aspects of
the situation as possible. The question of whether or not a career field
structure has changed must always be defined ultimately in relation to
who might be concerned. It might be that for certain data users broad
information with significant detail would be required and a more
comprehensive and updated survey might be necessary. For another data user, a
limited set of job groups might be of interest. For still another data
user general information might be all that was required and whatever changes
that might be occurring might not be relevant to that level of management
questioning. Each request might require a different level of need for data
currency. Thus a decision to resurvey a field could be approached from
several viewpoints. If a small sample could be relied upon as being
sensitive to a certain degree of differentiation, existing data might be
revalidated for even detailed questions or issues.

The initial question might then be posed as "How sensitive is a smaller,
more limited survey sample in reflecting the structure of a career field?"
Another way of looking at this question would be to ask "How much
information and what kind of information is lost in relation to career field
structure when a sample size is diminished?" In beginning to develop
techniques for dealing with how often a survey would need re-administration,
and if smaller than 100 percent samples are to be used, this first step
of evaluating the sensitivity of sample size seems like a useful starting
point.

Plan of Research

In order to investigate possible information loss in smaller samples
a gradated series of occupational analyses will be performed. Initially
one data set was selected from available career fields currently being
analyzed. From this data file four separate random samples will be drawn
([illegible] 2% and 10% of the total original survey cases). Each of these
samples will then be run as an independent cluster analysis by the CODAP
system. Since the randomized sampling will be drawn independently from
the total survey case file, there should be some degree of repetition of
cases within the four samples. After all the studies have been job typed,
comparisons of the clusters across the four samples and with the original
survey clustering structure will be made. Comparisons will then be made
between the job groups among the four subsample populations to determine
the degree of similarity between job groups in different survey samples.
If similar job structures emerge in the cluster diagrams, multiple
comparisons will be made of what groups are not represented in the smaller
samples as well as some judgement of what information might be made less
evident by moving toward a smaller sample.
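
The independent draws described above, and the repetition of cases they imply, can be sketched in Python. This is a minimal illustration under stated assumptions: the four sample fractions are placeholders (the list of percentages in the source is only partly legible), case identifiers are simply integers, and the function names are hypothetical.

```python
import random
from itertools import combinations

N_CASES = 1111  # usable survey returns reported for AFSC 303X1

# Placeholder fractions: NOT the study's actual four percentages,
# which cannot be fully recovered from the source text.
FRACTIONS = [0.02, 0.05, 0.10, 0.25]


def draw_samples(n_cases, fractions, seed=0):
    """Draw one simple random sample per fraction.

    Every sample is drawn from the full case file, independently of the
    others, so the same case may appear in more than one sample.
    """
    rng = random.Random(seed)
    case_ids = range(n_cases)
    samples = {}
    for frac in fractions:
        size = round(n_cases * frac)
        samples[frac] = set(rng.sample(case_ids, size))
    return samples


def overlap_report(samples, n_cases):
    """For each pair of samples, return (observed, expected) shared cases.

    For independent samples of sizes a and b drawn from N cases, the
    expected number of shared cases is a * b / N.
    """
    report = {}
    for (f1, s1), (f2, s2) in combinations(samples.items(), 2):
        expected = len(s1) * len(s2) / n_cases
        report[(f1, f2)] = (len(s1 & s2), expected)
    return report


if __name__ == "__main__":
    samples = draw_samples(N_CASES, FRACTIONS)
    for pair, (observed, expected) in overlap_report(samples, N_CASES).items():
        print(pair, "observed:", observed, "expected: %.1f" % expected)
```

Because each draw starts from the complete file, larger samples share more cases with one another, which is worth keeping in mind when agreement between their cluster structures is interpreted.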

The Air Traffic Control Radar Repair Career Ladder (AFSC 303X1) Survey
was selected as a data file because it has just recently been surveyed
and appears to be a relatively stable career field. The usable survey
returns included 1,111 cases and represent [percentage illegible] of the career ladder. Twenty
different job groups were defined in the last survey report, indicating a
fairly heterogeneous career ladder. The reported group sizes from this OSR
will be used to measure the representativeness of the four random sample
groups.
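
One simple way to measure representativeness against reported group sizes is to compare each job group's share of the full survey with its share of a sample, and to list the groups that a sample misses entirely. The Python sketch below assumes each case carries a job-group label; the labels and counts in the usage example are invented for illustration and are not taken from the OSR.

```python
from collections import Counter


def representativeness(full_labels, sample_labels):
    """Compare job-group shares in a sample with those in the full survey.

    Returns (rows, missing):
      rows    maps each job group to (share_in_full, share_in_sample)
      missing lists job groups present in the full survey but absent
              from the sample
    """
    if not full_labels or not sample_labels:
        raise ValueError("both label lists must be non-empty")
    full = Counter(full_labels)
    samp = Counter(sample_labels)
    n_full, n_samp = len(full_labels), len(sample_labels)
    rows = {}
    for group, count in full.items():
        rows[group] = (count / n_full, samp.get(group, 0) / n_samp)
    missing = sorted(g for g in full if g not in samp)
    return rows, missing


if __name__ == "__main__":
    # Hypothetical labels: three job groups in a ten-case "survey".
    full = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
    sample = ["A", "A", "B"]
    rows, missing = representativeness(full, sample)
    for group, (full_share, sample_share) in sorted(rows.items()):
        print(group, round(full_share, 2), round(sample_share, 2))
    print("missing from sample:", missing)
```

Small job groups are the ones most likely to appear in the missing list, which matches the expectation that a smaller sample loses detail first at the level of the least populated groups.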

Discussion

The analysis of the four studies will be carried out by one occupational
analyst. As an initial study much of the procedure will involve relative
comparison for the heuristic value of determining if some pattern or trend
in the data might exist. Squadron Leader William J. Watson (RAAF)
completed a similar type of study while working at AFHRL in 19?9. His study
used a sample of 1,9H3 cases drawn at random from 5*5^ cases collected.
This analysis was compared to an analysis of a second set of data drawn
randomly from the remaining cases. All the job types identified
in the first analysis were also identified in the second with the exception
of six small job groups accounting for /3 cases. This study
demonstrated that equal samples drawn from the same population without substitution
would yield a roughly similar analysis. By taking this subsampling
approach to a smaller sample size and sampling with replacement after each
percentage draw, it is hoped that the sensitivity of the clustering
technique with small samples can be assessed. Considering this previous study,
it might be expected that some job groups might be "missing" in any one
of the samples without being interpreted as being less accurate or
representative. The question of whether or not there would be a "significant"
difference would depend on whether or not an Air Force manager might come
to identical conclusions based on data from the various samples. Any set
of decision criteria later to be developed from the initial findings of
this study would require consideration of the realm of data utilization.


AN INVESTIGATION OF FIVE ALTERNATIVE INSTRUCTIONAL STRATEGIES


Charles  W.  Howard,  Ph.D 


INTRODUCTION 


The field of instructional technology includes research and
development in the areas of programmed instruction, audio visual
equipment and digital computers. Each area has developed instructional
materials or software to improve the quality of instruction. One
important problem is how to modify existing instructional programs which
have not been developed utilizing a systems approach. In addition to
insuring post-instructional competence of the trainee, instructional
strategies must also be directed toward the transfer of learned skills
to applications of those skills. A fundamental problem shared by
instructional technologists is modifying or developing materials from
existing content materials. Each area has developed instructional
materials specified by objectives and evaluated by objective based tests
to insure the quality of instruction. Instructional technologists
utilize the Instructional Systems Development (ISD) approach for
analyzing the performances to be learned and designing instructional systems
to insure that the desired instructional outcomes are realized.

The Department of Army has begun converting its training program
into self-instructional sets of materials. Numerous programs are
currently self-instructional; however, many instructor-taught programs
are still used and in need of adaptation. James E. Briggs (1964),
Smith (1966) identified two potential problems in converting existing
lecture/demonstration course materials into self-instructional
programs, which include 1) time to produce effective sets of materials; and
2) available programmers. Because of time constraints and a lack of
available programmers, alternate strategies for converting typical
course materials need to be analyzed.

With this mission, it is necessary to examine (a) the effects of
alternative strategies for instructional development of
self-instructional materials, and (b) whether instructional developers can modify
existing software into self-instructional programs which promote
learning by employing auto-elucidation techniques.

The primary purpose of this study was to determine the efficiency
of five types of instructional development strategies. Efficiency
was investigated in terms of achievement and time. The secondary
purposes were: 1) to investigate the utility of informational feedback
for each instructional development strategy, and 2) to investigate the
utility of adjunct programming in the form of post-text questions and
remedial branching.

The five instructional strategies that were explored in this study
were:

1. instructional text;

2. instructional text supplemented by post-text questions;

3. instructional text supplemented by post-text questions along
with knowledge of results of the post-text questions;

4. instructional text supplemented by post-text questions,
knowledge of results of the post-text questions and directions for
remedial instruction;

5. programmed instruction, developed using the identical content
of the instructional text and containing linear and branching
programming.

The independent variable in this study was instructional strategy.
The dependent variables in this study were adjusted post-test scores,
post-test time, total package time and total teaching time.


METHODOLOGY 


Sample 

The  population  from  which  the  sample  was  drawn  consisted  of 
persons serving as enlisted or commissioned engineering personnel for
the  United  States  Army.  This  study  was  conducted  at  Fort  Belvoir  Army 
Engineering  School,  Belvoir,  Va. 


The sample consisted of 720 subjects who were assigned to various
technical training programs by the Department of the Army. The subjects
were randomly assigned to one of the six groups, i.e., five treatment
groups and one control group. This random assignment yielded 120
subjects per group.
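
A balanced random assignment of this kind can be sketched in Python as follows. The group labels paraphrase the five strategies and the control group described in this paper; the subject identifiers, the seed, and the function name are assumptions for illustration, and the sketch is not a record of the procedure actually used at Fort Belvoir.

```python
import random

# Labels paraphrase the five strategies plus the control group.
GROUPS = [
    "text",
    "text + questions",
    "text + questions + KR",
    "text + questions + KR + remediation",
    "programmed instruction",
    "control",
]


def assign_groups(subject_ids, groups=GROUPS, seed=0):
    """Shuffle the subjects and cut the shuffled list into equal groups.

    Raises ValueError if the subjects cannot be divided evenly, so that
    every group ends up with the same number of subjects.
    """
    ids = list(subject_ids)
    if len(ids) % len(groups) != 0:
        raise ValueError("subjects cannot be split into equal groups")
    rng = random.Random(seed)
    rng.shuffle(ids)
    size = len(ids) // len(groups)
    return {g: ids[i * size:(i + 1) * size] for i, g in enumerate(groups)}


if __name__ == "__main__":
    assignment = assign_groups(range(720))
    for group, members in assignment.items():
        print(group, len(members))  # 720 subjects over 6 groups: 120 each
```

Shuffling once and slicing guarantees equal group sizes, whereas assigning each subject independently at random would give group sizes that only average 120.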

Pre-test scores were used to determine the level of mastery of
each subject. Each subject was classified as a non-master and
therefore admissible as a subject for this study.


Treatment 


The five instructional strategies consisted of content material,
questions, knowledge of results and remediation appropriate to the
strategy.

The  primary  component  was  the  content  material.  The  content 
material  used  for  this  study  described  the  identification  and  functions 
of  components  for  a  low-voltage  circuit  tester. 

The next component of the instructional strategy was the issuance
of  statements  or  questions.  The  questions  were  derived  from  the  content 
material  and  were  designed  to  facilitate  learning  for  knowledge  and 
comprehension  of  the  content  material. 

Knowledge of results (KR) was the next component. This component
consisted of subject matter experts' responses to the questions. For
the purposes of the study, distinctions between instructional strategies
were made by providing KR to the learner in three of the five
instructional strategies. The inclusion of KR was a form of feedback; however,
the implied use of KR was limited to informational feedback. The KR
component was designed to provide the learner with the model performance
expected and therefore enable the learner to make a comparison of the
model performance and his actual performance.

The  final  component  of  the  instructional  strategy  was  remediation. 
This  component  consisted  of  a  written  statement  directing  the  subject 
to  appropriate  portions  of  the  content  material  when  actual  performance 
made by the subject was not in agreement with the model performance.

Design

The research design proposed to test the hypotheses for this study
consisted of an analysis of variance model and an analysis of
co-variance model. The ANOVA model was designed to separately test the
dependent variables: post-test time, total package time, and teaching
time. The ANCOVA model was designed to test the dependent variable
post-test score adjusted for the influence of the pre-test score. The
independent variable was instructional strategy.

The F ratios resulting from the models for each main effect were
tested with the probability of falsely rejecting a hypothesis set at
5%. The procedures employed to test pairwise comparisons of means for
significant F ratios were the Student-Newman-Keuls and Duncan's
simultaneous significant post-hoc tests.
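
For readers who wish to see how an F ratio and its degrees of freedom arise in a one-way analysis of variance, a minimal Python sketch follows. It is a textbook computation written for illustration, not the software used in this study, and the scores in the usage example are invented. With five groups of 120 subjects the degrees of freedom come out as (4, 595).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    F is the between-group mean square divided by the within-group
    mean square; df_between = k - 1 and df_within = N - k, where k is
    the number of groups and N the total number of scores.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    # Invented scores for three small groups.
    f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
    print(f_ratio, df_b, df_w)
```

The computed F would then be compared with the critical value of the F distribution at the 5% level for the given degrees of freedom; that lookup is omitted here.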


Procedure 


The administration of each of the five instructional strategies
followed a common procedure consisting of four phases.


Phase  I 

Distributed  a  set  of  instructional  objectives  to  each  subject  of 
each  treatment  group,  as  well  as  the  subjects  in  the  control  group. 


Phase II

The subjects were given the package pretest for a maximum period
of 30 minutes. Each subject's length of time for testing was recorded
on his individual test by the monitor for analysis in this study.


Phase  III 

Subjects were given a set of instructional materials appropriate
to their randomly assigned treatment group. The maximum length of
instructional time was 50 minutes for each group. When a subject
completed the materials he returned the materials to the monitor. The
monitor recorded the amount of time for each subject.


Phase IV


Immediately  following  the  treatment  the  subjects  were  given  a 
post-test on the materials. The maximum length of time for the
post-test was 30 minutes. The same procedure for recording the subjects'
time  for  taking  the  post-test  was  used  as  was  used  in  the  pre-test 
setting. 


Results 


The means, standard deviations, and Ns for the five instructional
strategies  and  the  control  group  on  the  dependent  variables  total 
package  time,  post-test  time,  teaching  time,  pre-test  score,  post-test 
score  and  adjusted  post-test  score  are  reported  in  Table  1. 

The dependent variables total package time, post-test time, and
teaching time were analyzed by an analysis of variance (ANOVA) model.

The results of these analyses indicated that a significant difference
existed between the groups: total package time, F[.95] (4, 595) =
59.457, p < .05; post-test time, F[.95] (5, 719) = 20.68, p < .05;
teaching time, F[.95] (4, 595) = 67.51, p < .05. The pairwise
comparisons were analyzed for each of these dependent variables using
the Student-Newman-Keuls (SNK) post-hoc simultaneous significance test.
The results of the SNK tests are condensed and reported in Table 2. The
results indicate that the text and question group was not significantly
higher than the programmed instruction group on the dependent variables
post-test time, teaching time, or total package time. Thus, an
instructional strategy has been identified that is as efficient as
programmed instruction.
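The relationship between an F ratio and its degrees of freedom can be sketched in a few lines. The function below is a minimal hand-rolled one-way ANOVA; the name `one_way_anova_f` and the sample scores are hypothetical illustrations and are not data from this study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per treatment group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of subjects

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Illustrative (made-up) scores for three small groups.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(example_groups))
```

With k groups and N subjects the degrees of freedom are (k - 1, N - k), so a reported F (4, 595) is consistent with five groups and 600 subjects in total.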

The effectiveness of the instructional strategies was evaluated in
terms of performance.

The dependent variables pre-test and post-test performance scores
were analyzed by an analysis of covariance (ANCOVA) model. The result
of this analysis indicated that a significant difference existed between
the adjusted post-test performance scores reported in Table 1, with
F[.95] (5, 713) = 53.59, p < .05. A test of homogeneity of regression
was performed and no violation of the assumption was observed. The
pairwise comparisons were analyzed using the Duncan post-hoc
simultaneous significance test. The results are shown in Table 3 and
indicate that the text plus question group was not significantly
different from the programmed instruction group. Thus, an instructional
strategy has been identified that is as effective as programmed
instruction.
The actual formation of the five instructional strategies consisted
of incorporating the four components (content, questions, KR, and
remediation) into the instructional development strategies as
described below.

1)  Programmed  Instruction:  Subject  was  given  a  self-instructional 
set  of  materials  that  had  been  developed  to  move  the  subject  through 
the  material. 

2)  Text:  Subject  was  given  a  manual  that  contains  the  content 
material  with  no  programming  of  the  content  material. 

3)  Text/Questions:  Subject  was  given  a  manual  that  contains  the 
content  material  plus  a  set  of  questions  which  were  at  the  end  of  the 
material.  The  directions  preceding  the  questions  instructed  the  learn¬ 
er  to  overtly  respond  to  the  questions. 

4)  Text/Questions/Knowledge  of  Results:  Subject  was  given  a 
manual  and  a  set  of  questions  as  above.  This  instructional  strategy 
was  designed  to  include  the  correct  responses  for  the  questions  in 
addition  to  text/questions. 

5)  Text/Questions/KR/Remediation: Subject was given a text
containing the content material, questions, KR, and directions to
portions of the text to review non-mastered content material.

6)  Control:  Subject  was  pretested  and  post-tested  with  no 
instructional  treatment. 


Instruments 

The  pre-test  and  post-test  consisted  of  27  matching  items.  The 
format  was  to  provide  each  subject  with  a  list  of  statements  and  a 
graphic  representation  of  the  low-voltage  circuit  tester.  The  graphic 
included  the  components  of  the  low-voltage  circuit  tester  pictured 
with  numerics.  The  objective  was  to  match  the  statement  with  the 
numeric  in  the  graphic. 


Discussion 


The  findings  suggest  an  alternative  instructional  strategy  has 
been  developed  that  is  as  effective  and  efficient  as  programmed 
instruction.  Programmed  instruction  results  from  a  rigorous  applica¬ 
tion  of  the  ISD  procedures  and  is  also  extremely  time-consuming  and 
therefore expensive. The instructional strategy which has been found
to be as effective and efficient as programmed instruction uses the
principles of mathemagenic behavior and auto-elucidation techniques.
The application of mathemagenic activities and auto-elucidation
techniques has cost-effective implications for the conversion of
instructional materials to self-instructional materials.


Conclusions 


This research has examined alternative instructional development
strategies and auto-elucidation techniques, e.g., effects of KCR,
remediation, and post-text questions. This investigation has focused
on the effectiveness of five types of instructional strategies.

The results support the application of auto-elucidation techniques
in the form of post-text questioning. Text or written material
followed by post-text questioning resulted in performance as effective
as programmed instruction. Post-text questions are therefore regarded
as aids to the learner which stimulate the learning environment.

Post-text questions supplementing but following the reading of
written prose or text have been demonstrated to be an effective mixture
of instructional materials. For example, the treatments text, text plus
questions plus feedback, and text plus questions plus feedback plus
remediation, when uncontrolled, were found to be less effective than
P.I. or the use of text plus post-text questions.

In summary, the results support the following conclusions: 1) the
use  of  post-text  questions  in  the  form  of  text  and  post-text  questions 
was  demonstrated  to  be  the  most  effective  alternative  strategy;  2)  the 
application  of  auto-elucidation  components  in  the  form  of  knowledge  of 
results  and  remediation  when  uncontrolled  was  less  effective  than  the 
use  of  post-text  questions  only. 
IMPLICATIONS AND RECOMMENDATIONS


The primary purpose of this investigation was to identify the
efficiency of five types of instructional development strategies.

This study investigated the effects of applying auto-elucidation
techniques which were developed by Sidney Pressey. E.Z. Rothkopf and
L. Frase further studied the process of incorporating questions into
written prose. The implications are that auto-elucidation techniques
in the form of post-text questions create a learning situation that is
both meaningful to the learner and desirable to the instructor and the
institution. The meaningfulness of the learning situation has been
described in the literature by E.Z. Rothkopf and L. Frase as
mathemagenic behavior. The use of post-text questions requiring overt
response without any form of feedback produced the most desirable
results both in terms of time and performance for the alternative
instructional strategies.

This implication has major impact in the psychological area of
feedback for curriculum developers, instructional technologists and the
field of psychology. Arguments against this implication would generally
indicate that treatment groups receiving text + questions and feedback
would presumably obtain higher scores. However, as demonstrated within
this study, this was not the case. Therefore, instructional materials
for which treatment groups do not have access to the correct answers,
but are permitted to review the instructional materials, are both
effective and efficient.

Mathemagenic behavior as described in the literature and discussed
in this study can be present in the absence of informational feedback.

The instructional strategies investigated have incorporated the
principles of systems analysis, informational feedback and
auto-elucidation processes.

Instructional development models used in this study yielded
instructional materials that were efficient in terms of reducing
teaching time and effective in terms of performance. Therefore, the
conversion of existing instructional materials into self-instructional
modules using mathemagenic principles can have cost-benefit advantages
for educational, industrial and Armed Service institutions.
The implications for cost-effective instructional materials
development are as follows:

1)  Materials may be developed that yield performance scores
higher than existing ISD-developed materials while requiring less
teaching time;

2)  The cost associated with development of the materials may be
significantly less.

The next implication concerns the Armed Services. The Armed
Services have two plans under which all training operates. The two
plans are peacetime and wartime. Without discussing either of the
plans in detail, the fact remains that during wartime the Armed
Services must reduce the training time of soldiers in order to meet
the demands. Therefore, considerations must be made regarding the
instructional materials. Results of this study indicate that the
difference in the means for the text (2) and question (3) groups' total
teaching time was 6.76 minutes, which represents one-third less time.

The decrease in performance was 3.37 test items or 12% of the total
possible score. Without discussion about the future effectiveness of
the soldier, the point to be considered is that alternative
instructional strategies may cut training time by 1/3, therefore better
meeting the wartime needs of our country.
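The two figures quoted above can be checked with simple arithmetic. The sketch below uses only numbers stated in the text (27 test items, a 3.37-item decrease, a 6.76-minute difference); the implied longer teaching time is an inference from the "one-third" statement, not a value reported in the paper, and the variable names are hypothetical.

```python
TOTAL_ITEMS = 27          # items on the pre-test and post-test
SCORE_DROP_ITEMS = 3.37   # reported decrease in performance, in test items
TIME_SAVED_MIN = 6.76     # reported difference in mean teaching time, in minutes

# Decrease in performance as a percentage of the total possible score.
score_drop_pct = SCORE_DROP_ITEMS / TOTAL_ITEMS * 100

# Assumption: if 6.76 minutes is one-third of the longer teaching time,
# that longer time is roughly three times 6.76 minutes.
implied_longer_time_min = TIME_SAVED_MIN * 3

print(f"score decrease: {score_drop_pct:.1f}% of total possible score")
print(f"implied longer teaching time: {implied_longer_time_min:.2f} minutes")
```

The percentage rounds to the 12% stated in the text.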
PAIRWISE COMPARISON OF MEAN DIFFERENCES OF
THE SIX GROUPS ON ADJUSTED POST-TEST SCORE
AND THE CONTROL GROUP
Course Quality Measurement of CAI
The IBM Field Engineering Division uses a CAI/CBI system
with approximately 400 terminals in branch offices across
the country to accomplish technical training of IBM's
customer service personnel. The system currently
administers 374 different courses and averaged 28,000
student hours per month in 1976. Many new courses are
developed each year to keep pace with advances in product
and service technology.

This CAI/CBI system is called the Field Instruction
System - Version II (FIS II). It uses a modification of an
IBM programmed product, the Interactive Instruction System
(IIS), as the teaching vehicle. Another IBM programmed
product, Interactive Query and Report Processor (IQRP), is
utilized for the course quality analysis part of FIS II.
The application of these two programmed products (IIS and
IQRP) is how FIS II accomplishes "course quality measurement
of CAI".

FIS II utilizes an internal company teleprocessing
network to connect more than 160 branch offices to a central
computer in New York.

Course quality is a major concern in this system because
of the large number of new courses released each year and
because of the decentralized structure of the course
development organization. Fifty-one new courses, consisting
of 140 units that average five hours each, were released in
1976. They were developed by twelve geographically separate
departments.

The purpose of this publication is to describe a course
quality measurement system. It is necessary to establish a
proper context for this description; therefore the first few
pages will be used to establish an overall picture of FIS II
and the functions which support the course quality
measurement part of the system.
The Field Engineering Division became involved in CAI in
the early 1960's when "book type" programmed instruction
material was adapted to CAI typewriter terminals. This was
an experimental era. Several CAI systems were tried and
evaluated and research was conducted to identify the most
effective types of CAI materials and learning
methods/models. In 1968 a sophisticated full scale
teleprocessing CAI system was implemented.

Backed by seven years' experience, a redesigned CAI system
was implemented in 1975. The IBM Interactive Instructional
System was selected as the base for the redesigned system.
IBM 3270 Information Display System cathode ray tube
terminals replaced the typewriter terminals and a new
improved course model was developed. The new model
incorporates current innovations in training psychology and
the experience gained from 4,000,000 student hours executed
on the previous system. This new CAI system was named Field
Instruction System - Version II (FIS II).


FIS II course structure follows a typical CAI model
employing multi-level components, pre/post testing, remedial
and unit (criteria) testing. A brief description of each
component follows (Figure 1):

Course: Average course length is 10 hours, ranging from 1 to
80 hours. Satisfactory completion of a course certifies
that the individual can perform certain job tasks when
servicing customer equipment. Some FIS II courses are
prerequisite to Education Center laboratory courses where
actual "hands on" experience is gained.

Unit: This is a logical sub-component of the course's
subject. Average length is 5 hours. During Unit Evaluation
at the end of each unit the student's opinion is solicited
via a Student Opinion Questionnaire and a Unit Test is
administered. Course quality measurement data is stored "by
unit", resulting in the capability to analyze the quality of
each unit.

Session: A session is a logical grouping of objectives.
Each objective has a related:

-  Test item

-  Assignment (teaching material)

-  Remedial material

Correctly answering a Pretest Item branches the student
past the related Assignment and past any further testing
(Post Test and Unit Test).

For each Assignment presented to the student a Post Test
Item is administered. Remedial material is administered for
Post Test Items not answered correctly.
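The branching described above for a single objective can be sketched as a small function. This is only an illustration of the stated rules; the function name `session_path` and the step labels are hypothetical and are not taken from FIS II.

```python
def session_path(pretest_correct, posttest_correct):
    """Return the ordered steps a student sees for one objective.

    pretest_correct:  True if the Pretest Item was answered correctly.
    posttest_correct: True if the Post Test Item was answered correctly
                      (ignored when the pretest was passed).
    """
    steps = ["pretest item"]
    if pretest_correct:
        # A correct pretest answer branches the student past the
        # Assignment and past any further testing on this objective.
        return steps
    steps.append("assignment")
    steps.append("post test item")
    if not posttest_correct:
        # Remedial material is given for Post Test Items missed.
        steps.append("remedial material")
    return steps


print(session_path(pretest_correct=False, posttest_correct=False))
```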

Assignment: The Assignment teaches a single objective and is
composed of Activities, each relating to a separate student
learning task such as:

-  Reading text (on the screen or referenced in another
medium)

-  Answering study questions

-  Working on application type exercises

-  Solving a problem

The Assignment and Activity levels are where the
"teaching" occurs.
Common test items are used for the various tests in each
unit. Typically, two alternate-equivalent items are
developed for each objective in the unit. One of the items
is randomly selected and administered, as appropriate, in
the Session Pretest, Session Post Test and Unit Test. Using
common items reduces the author's test development effort and
helps ensure that each test is valid to its objectives.

Alternate items are used because the student may be
tested several times on the same objective. Administering
an alternate-equivalent item during the re-testing of an
objective accomplishes two things:

-  Reduces student feelings of "continually seeing the same
questions".

-  Reduces the possibility of the student correctly
answering a re-test item from information temporarily
memorized from previous testing.
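One way to picture the selection of alternate-equivalent items is shown below. This is a sketch under stated assumptions: the text says only that one item is randomly selected and that an alternate is administered on re-testing; the function `pick_item`, its parameters, and the fallback behaviour are hypothetical.

```python
import random


def pick_item(versions, previously_seen=None, rng=random):
    """Choose a test item version for one objective.

    versions:        the alternate-equivalent versions, e.g. ["a", "b"].
    previously_seen: the version used at the last testing, if any.
    """
    # Prefer a version the student did not see last time; if every
    # version has been seen (assumed fallback), choose among all of them.
    candidates = [v for v in versions if v != previously_seen] or list(versions)
    return rng.choice(candidates)


first = pick_item(["a", "b"])
retest = pick_item(["a", "b"], previously_seen=first)
print(first, retest)
```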


Test items: Each test item measures a single
objective. A test item can consist of any number and type
of questions, exercises, problems, etc. This ensures that
an item can comprehensively measure its objective.

Top level testing reduces the total number of test items
administered to the student while maintaining comprehensive
testing capability. For example, a terminal objective has
two enabling objectives. When learning the three objectives
the student is first given teaching material for the two
enabling objectives. Then, the teaching material for the
terminal objective is presented. However, during testing
the student is given the terminal test item first and if it
is answered correctly the two enabling test items are
skipped.
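The testing order in the example above can be sketched as follows. The text states only that the enabling items are skipped when the terminal item is answered correctly; administering them after a miss is an assumption made here, and the name `items_to_administer` is hypothetical.

```python
def items_to_administer(terminal_correct, enabling_items):
    """Return the test items given for one terminal objective.

    terminal_correct: True if the terminal test item was answered correctly.
    enabling_items:   labels of the enabling test items under that objective.
    """
    administered = ["terminal"]
    if not terminal_correct:
        # Assumption: a missed terminal item leads to the enabling items.
        administered.extend(enabling_items)
    return administered


# One terminal objective with two enabling objectives, as in the example.
print(items_to_administer(True, ["enabling 1", "enabling 2"]))
print(items_to_administer(False, ["enabling 1", "enabling 2"]))
```

In this sketch a student who answers the terminal item correctly sees one item instead of three.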


Common Test Items for Session

The example shows the content of a typical session, including
both Terminal (T) and Enabling (E) Objectives and related test
items. The terminal test items have "a" and "b" alternate
versions.


FIGURE 2


The Student Opinion Questionnaire solicits the student's
opinion of the course, the system (FIS II) and of the
local branch office study environment. It is administered at
the end of each unit and uses a three level branching logic
which is designed to (Figure 3):

-  Continue to solicit opinion details if the student's
responses are significantly positive or negative.

-  Stop soliciting opinion details when the student stops
responding.

At the first level every student is asked for an overall
opinion of the unit and must respond to a six choice range:

-  Very Good

-  Good

-  Average

-  Poor

-  Very Poor

-  No Opinion

A response of Average or No Opinion ends the questionnaire.
A response of Very Good, Good, Poor, or Very Poor takes the
student to level two of the questionnaire where a single
screen containing fifteen selectable items is displayed. If
none of the items are selected the questionnaire ends.
However, for every item that is selected a level three
screen is displayed containing additional detailed items.

This branching type questionnaire minimizes the amount of
student time spent in the questionnaire and reduces the
collection and analysis of "null" responses. The maximum
number of responses the student could give, including all
items in all three levels, is seventy-five. The minimum
number is one, a null response to the level one question.
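The three level branching logic can be sketched as a function that lists the screens a student would see. This is an illustrative sketch only: the names `screens_shown`, `LEVEL_ONE_CHOICES` and `ENDS_QUESTIONNAIRE` are hypothetical, and the screen labels are not FIS II identifiers.

```python
LEVEL_ONE_CHOICES = ["Very Good", "Good", "Average", "Poor", "Very Poor", "No Opinion"]
ENDS_QUESTIONNAIRE = {"Average", "No Opinion"}


def screens_shown(overall, selected_primary_items):
    """Return the screens displayed for one student.

    overall:                the level one response.
    selected_primary_items: item numbers chosen on the level two screen.
    """
    if overall not in LEVEL_ONE_CHOICES:
        raise ValueError(f"unknown level one response: {overall!r}")
    screens = ["level 1: overall question"]
    if overall in ENDS_QUESTIONNAIRE:
        return screens  # Average or No Opinion ends the questionnaire
    screens.append("level 2: primary screen")
    # One level three screen per selected item; none selected ends it here.
    for item in selected_primary_items:
        screens.append(f"level 3: detail screen for item {item}")
    return screens


print(screens_shown("Poor", [1, 15]))
```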

Mastery Testing: The unit test, and all teaching material,
can be re-taken as many times as necessary to satisfactorily
complete the test. A satisfactory Course Completion
Certificate is given by the system when all unit tests have
been satisfactorily completed.
OVERALL QUESTION (6 RESPONSE CHOICES) -> PRIMARY SCREEN (15 CHOICES)
-> SECONDARY SCREENS (UP TO 15 SCREENS, ONE PER EACH PRIMARY CHOICE)

VERY GOOD, GOOD       -> primary screen -> secondary screens
AVERAGE, NO OPINION   -> EXIT QUESTIONNAIRE
POOR, VERY POOR       -> primary screen (see example below)
                         -> secondary screens (see example below)


EXAMPLE OF PRIMARY SCREEN CHOICES

You indicated, "the unit is POOR/VERY POOR", what are
your reasons?

1.  COVERAGE of the subject is wrong (too little/too much)
2.  MATERIAL presented is DIFFICULT to UNDERSTAND
3.  HUMOR and use of INFORMAL language is poor
 .
 .
15.  SUBJECT being taught is NOT INTERESTING
     REASON is NOT SHOWN above


EXAMPLE OF CHOICES ON A SECONDARY SCREEN

You indicated, "COVERAGE of the subject is wrong", what
are your reasons?

1.  TOO LITTLE is covered - I don't feel confident
2.  TOO MUCH is covered - unnecessary material is presented
    REASON is NOT SHOWN above


FIGURE 3
Concept: When a new course is released, how can you tell if
it's a "good" course or if it's a "bad" course? FIS II uses
the term Course Quality to describe how good or bad a course
is. Course quality consists of three components (Figure 4):

-  Effectiveness (ability of student to perform as stated
in objectives)

-  Efficiency (ability of student to learn in minimal time)

-  Acceptance (student opinion of the training experience)

There is a considerable amount of interrelationship
between the three components. For example, a student who
"likes" the training experience may learn "faster". The
components in this system constitute a hierarchy of
importance and if the system was limited to measuring only
one component it would be effectiveness.

Users: The system is designed for these users:

-  Author (staff personnel who develop the CAI course or
course unit)

-  Point of Control (local management and senior staff
personnel who have on-going responsibility for a group
of courses)

-  Headquarters (upper management - responsible for all
courses)

-  Research


COMPONENTS OF CAI COURSE QUALITY

-  EFFECTIVENESS (PERFORMANCE)

-  EFFICIENCY (TIME)

-  ACCEPTANCE (OPINION)


FIGURE 4


The system is a "demand only" system. Each
report must be requested. Reports do not have a scheduled
printing and distribution cycle. This keeps paperwork to a
minimum and helps ensure the course quality information ends
up only in the hands of those who want it, when they want
it.


All reports are displayable/printable on-line at any of
the system's CAI terminals. Special requests for analysis of
unique combinations of data can be formulated and the report
can be obtained on-line in minutes. This is especially
useful when seeking the cause of a "low" quality problem
where repeated analysis is required.

Uses: The system is used in many ways by each of the users.
Primary uses include:

-  Reinforce author.

-  Identify low quality training requiring revision.

-  Identify quality training methods and techniques for
propagation.

-  Allow management to direct revision resources to the
lowest quality course.
During design of the FIS II measurement system existing
programs were examined for suitability, in an attempt to
eliminate the development of a new program. The Field
Engineering Management Information System (FE/MIS), an
internal IBM program, met the design specifications and was
selected. This program is now available outside IBM as a
programmed product called Interactive Query and Report
Processor (IQRP).

The IQRP features that are most applicable to the FIS II
Course Quality measurement system include:

-  Record selection flexibility

-  Calculation capability

-  Sequencing/subtotaling capability

-  Output format flexibility

-  "Macro" capability for "packaging" often used reports

The record obtained from each student during
the completion of each course unit is 290 characters long
and includes the following course quality data:

IDENTIFICATION DATA

-  Course and unit number

-  Employee and branch office number

-  Date of unit completion

-  Point of Control location responsible for course

TEST DATA

-  Number of test tries

-  Result of last test try (satisfactory or unsatisfactory)

-  Item results (student's response to each of a maximum of
8t Unit Test items)

TIME DATA

-  Student's "on terminal" study time

-  Student's "off terminal" study time (as reported by
student)

-  Student's "on terminal" Unit Test time

STUDENT OPINION QUESTIONNAIRE DATA

-  Student's overall rating of the unit (very good, good,
average, poor, very poor, no opinion)

-  Student's responses to detailed opinion items (maximum
of 74 items)
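The contents of one unit-completion record can be pictured as a simple data structure. The sketch below only mirrors the four groups of data listed above; the class name, every field name, the field types, and the use of hours for times are assumptions, and it does not reproduce the actual 290-character layout.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class UnitCompletionRecord:
    # Identification data
    course_number: int
    unit_number: int
    employee_number: str
    branch_office_number: str
    completion_date: date
    point_of_control: str
    # Test data
    test_tries: int
    last_try_satisfactory: bool
    item_results: List[str] = field(default_factory=list)
    # Time data (hours assumed)
    on_terminal_study_time: float = 0.0
    off_terminal_study_time: float = 0.0   # as reported by the student
    on_terminal_test_time: float = 0.0
    # Student opinion questionnaire data
    overall_rating: str = "no opinion"
    detailed_opinion_responses: List[str] = field(default_factory=list)


# Hypothetical example record.
record = UnitCompletionRecord(
    course_number=70339, unit_number=1, employee_number="000001",
    branch_office_number="836", completion_date=date(1977, 8, 15),
    point_of_control="NYC", test_tries=1, last_try_satisfactory=True,
)
print(record.course_number, record.unit_number, record.overall_rating)
```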

Only the 50 most current records (students)
are stored for each course unit. When a record for a new
student completion is put on the IQRP file, the oldest
(51st) record for that unit is purged. This results in a
reasonable storage "size" requirement and still provides
current usable data for all users. Currently the FIS II
IQRP file contains over 9,000 records and includes 160 units
from 70 courses. The file continues to grow as new courses
are released.
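The "50 most current records per unit" rule behaves like a fixed-size queue. The sketch below illustrates the purge rule with Python's `collections.deque`; it is not how the IQRP file is implemented, and the class name `UnitRecordFile` and its methods are hypothetical.

```python
from collections import defaultdict, deque

MAX_RECORDS_PER_UNIT = 50  # limit stated in the text


class UnitRecordFile:
    """Keep only the most recent records for each (course, unit)."""

    def __init__(self, max_records=MAX_RECORDS_PER_UNIT):
        # A deque with maxlen drops its oldest entry when a new one
        # is appended to a full queue.
        self._records = defaultdict(lambda: deque(maxlen=max_records))

    def add(self, course, unit, record):
        self._records[(course, unit)].append(record)

    def records(self, course, unit):
        return list(self._records[(course, unit)])


unit_file = UnitRecordFile()
for student_number in range(51):
    unit_file.add(70339, 1, student_number)
print(len(unit_file.records(70339, 1)))
```

Adding a 51st record for a unit leaves 50 on file, with the oldest one purged.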

Questions & Inquiries: IQRP is used by formulating questions
about the data and analysis wanted (Figure 5). These
questions are entered at a terminal in an abbreviated
language called the IQRP inquiry (examples are shown later
in the RESULTS section).


EXAMPLE OF QUESTIONS USED IN FORMULATING IQRP INQUIRIES

What is the student acceptance of Unit #2 of Course #70339?

What are reasons that cause some students to indicate that
Unit #2 of Course #70339 is not acceptable?

What is the average student acceptance for Course #70339
(all units)?

What is the average student acceptance for all units in all
courses developed by department NYC?

Do students who rate units as "very acceptable" complete the
units faster or slower than students who rate units as "not
acceptable"?

NOTE: The preceding questions all involved the "acceptance"
component of Course Quality. Similar questions can also be
asked concerning the "effectiveness" and "efficiency"
components.


METHODS USED


Four methods are used to analyze course quality (Figure
6):

-  IQRP Summary

-  IQRP Detail

-  Response Recording by Unit

-  Response Recording by Student

These methods form a hierarchy of "level of information
detail", with Response Recording being the most detailed.
The methods using IQRP yield generalizable results, such as
percentages, that allow comparison of unit-to-unit,
course-to-course, and Point of Control-to-Point of Control.
The remainder of this publication will concentrate on the
IQRP methods.

The Response Recording analysis methods are not
generalizable. They yield results, such as specific student
answers, that cannot be compared to other units, courses, or
Points of Control. These analysis methods are a built-in
feature of IBM's Interactive Instruction System where they
are called Course Activity Summary report and Response
Recording report.

The ability to compare the quality of units, courses, and
Points of Control is very important to the Field Engineering
Division because of the decentralized structure of the
departments that develop the courses.
FIS2 TIME/UNIT COURSE = 70339 SUBTL/LH UNIT T/O.
PG.1  08/18/77  ***  FIS2 ANALYSIS REPORT  ***
FILE UPDATED ON 08/15/77

COURSE  CU  AUTH  PLAN  UNIT/AVG  PCT/AVG
            LOC   TIME
70339   1   NYC    9.2     8.1       87
            RECORDS 50
70339   2   NYC    2.1     2.2      106
            RECORDS 50

IQRP SUMMARY EXAMPLE: Each line shows Unit "summary" data (50
students) for effectiveness, efficiency, or acceptance.

IQRP DETAIL EXAMPLE: Each line shows Unit "detail" data (1 student)
for effectiveness, efficiency, or acceptance.

RESPONSE RECORDING BY UNIT EXAMPLE: Each line shows accumulated
responses at a single display screen.

RESPONSE RECORDING BY STUDENT: Each report shows every response made
by a student. Report too detailed to illustrate here.


FIGURE 6


RESULTS


The report examples used in this section are condensed
and simplified. However, they do represent actual or typical
results of the FIS II Course Quality Measurement system.


REPORT EXAMPLES

IQRP Summary Example: The top line of each report is the
IQRP language inquiry that was keyed in at a terminal and
produced the on-line report. The inquiry for the first
report shown translates to (Figure 7):

-  Asks IQRP to look at FIS II data only (FIS2).

-  Requests a preformatted report called TEST/RESULT.

-  Asks only for courses within limits (WL) of Course
#70335 and Course #70350. Only two of the courses are
shown.

-  Asks for subtotals by unit and course (SUBTL) and to
arrange courses and units in sequence from low-to-high
(LH).

-  Asks for totals only (T/O), rather than a report line
for each of the 50 students on file.
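The kind of selection and totaling that the inquiry asks for can be sketched in a general-purpose language. This is not IQRP syntax or IQRP behaviour; it is a hypothetical Python illustration of "courses within limits, arranged low-to-high, totals only", and the function `totals_by_unit` and the record layout are assumptions.

```python
from collections import Counter


def totals_by_unit(records, low, high):
    """Count records per (course, unit) for courses within [low, high].

    Returns a list of ((course, unit), count) pairs sorted low-to-high,
    giving totals only rather than one line per student.
    """
    counts = Counter(
        (r["course"], r["unit"])
        for r in records
        if low <= r["course"] <= high
    )
    return sorted(counts.items())


# Hypothetical records; only the course and unit numbers matter here.
sample_records = [
    {"course": 70339, "unit": 1},
    {"course": 70339, "unit": 1},
    {"course": 70343, "unit": 1},
    {"course": 70400, "unit": 1},   # outside the limits, so not counted
]
for (course, unit), count in totals_by_unit(sample_records, 70335, 70350):
    print(course, unit, "RECORDS", count)
```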

Now let's examine the report data. First the column headings
and Line 1 of the data:

-  COURSE = The first data line is for Course #70339.

-  CU = Data on this first line is for course Unit #1.

-  AUTH LOC = Department responsible for this unit is NYC.

-  PASS FAIL = Criteria in %, for passing Unit Test, as set
by author.

-  SAT = Number of students who completed the Unit Test
satisfactorily (50 of 50).
FIS2 TEST/RESULT COURSE WL 70335,70350 SUBTL/LH COURSE,UNIT T/O.

PG. 1  08/18/77     FIS2 ANALYSIS REPORT     FILE UPDATED ON 08/15/77

COURSE  CU  AUTH  PASS  SAT  UNSAT  TRY/1  TRY/2  TRY/3+  SAT1&2/PCT  SAT/ADJ  RECORDS
            LOC   FAIL
70339    1  NIC     70   50            50                        100       70       50
70339    2  NIC     50   50            47      2       1          98       48       50
70339    3  NIC    100   50            50                        100      100       50
70339  (course subtotal)                                          99       73      150
70343    1  SEC     70   44      1     42      2                  98       68       45
70343    2  SEC     70   45            30      9       6          87       57       45
70343    3  SEC     70   45            23      9      12          71       41       45
70343  (course subtotal)                                          85       55      135


FIS2 TIME/UNIT COURSE WL 70335,70350 SUBTL/LH COURSE,UNIT T/O.

PG. 1  08/18/77     FIS2 ANALYSIS REPORT     FILE UPDATED ON 08/15/77

COURSE  CU  AUTH  PLAN   UNIT/AVG  PCT/AVG  RECORDS
            LOC   TIME
70339    1  NIC    9.2       8.1       87       50
70339    2  NIC    2.1       2.2      106       50
70339    3  NIC   10.0      14.1      141       50
70339  (course subtotal)              111      150
70343    1  SEC   16.0      16.8      105       45
70343    2  SEC   11.0       8.9       80       45
70343    3  SEC   13.0      12.7       92       45
70343  (course subtotal)               92      135

FIGURE 7




-  UNSAT = Number of students who have not yet completed
the Unit Test satisfactorily (0 of 50).

-  TRY/1 = Number of students who have attempted the Unit
Test only once (50 of 50).

-  TRY/2 = Number of students who have attempted the Unit
Test twice.

-  TRY/3+ = Number of students who required three or more
attempts.

-  SAT1&2/PCT = % of students who completed the Unit Test
satisfactorily in the first two tries (100%).

-  SAT/ADJ = Adjusted SAT1&2/PCT column. The adjustment is
made by lowering the SAT1&2/PCT figure in proportion to
the PASS FAIL column, to reflect the effect of a test
with a lowered criterion (70). The SAT/ADJ figure is the
number that FIS II uses as the Course Quality -
Effectiveness indicator.
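As a rough illustration of how the SAT1&2/PCT column relates to the try counts, the sketch below divides the students who needed one or two tries by the number of records on file. The function name, the rounding rule, and the reading of TRY/1 and TRY/2 as counts of satisfactory completions are assumptions made for illustration; the report does not spell out the calculation.

```python
def sat_1_and_2_pct(try_1: int, try_2: int, records: int) -> int:
    """Percent of students on file who passed within their first two tries.

    Assumes try_1 and try_2 count students who completed the Unit Test
    satisfactorily on that try (an assumption, not stated in the report).
    """
    if records <= 0:
        raise ValueError("records must be positive")
    return round(100 * (try_1 + try_2) / records)


# Course #70339, Unit #1: all 50 students on file passed on the first try.
print(sat_1_and_2_pct(50, 0, 50))
```

The SAT/ADJ adjustment itself is not sketched here, because the text gives only a qualitative description of it.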

Now let's examine the overall report result for Course
#70339. Unit #1 has an effectiveness indicator of 70, Unit
#2 is 48, and Unit #3 is 100. The indicator of average
effectiveness for the entire course is 73. The cause of the
low effectiveness indicator for Unit #2 (48) should be
investigated and improvements made.
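The course-level figure quoted above is consistent with a simple unweighted mean of the unit indicators. The sketch below shows that arithmetic; the function name is hypothetical, and treating the course figure as an unweighted, rounded mean is an assumption that happens to reproduce the reported value.

```python
def course_indicator(unit_indicators):
    """Unweighted mean of the unit indicators, rounded to a whole number.

    Assumption: the course subtotal is a plain average of its units.
    """
    if not unit_indicators:
        raise ValueError("need at least one unit indicator")
    return round(sum(unit_indicators) / len(unit_indicators))


# Course #70339: units at 70, 48 and 100.
print(course_indicator([70, 48, 100]))  # 73
```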


Notice that the RECORD COUNT for each unit of Course
#70339 is 50. This means the course is not a "brand new"
course and the "rolling 50" storage function is in effect. The
data shown on this report is for the most recent 50
students. In contrast, look at the RECORD COUNT for Course
#70343. It is a new course and has fewer than 50 students.
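The "rolling 50" storage function can be pictured as a fixed-size buffer that discards the oldest record when a new one arrives. The sketch below is only an illustration of that idea using Python's `collections.deque`; the class and method names are hypothetical and say nothing about how the actual system stored its data.

```python
from collections import deque

ROLLING_LIMIT = 50  # the "rolling 50" described in the text


class UnitRecords:
    """Keeps only the most recent completion records for one unit."""

    def __init__(self, limit: int = ROLLING_LIMIT):
        # A deque with maxlen silently drops the oldest item when full.
        self._records = deque(maxlen=limit)

    def add(self, student_record) -> None:
        self._records.append(student_record)

    @property
    def record_count(self) -> int:
        return len(self._records)

    def oldest(self):
        return self._records[0]


new_course = UnitRecords()
for student in range(45):
    new_course.add(student)
print(new_course.record_count)  # 45: fewer than 50, so nothing dropped yet
```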




The second report shown is the Course Quality measurement
of Efficiency and is called TIME/UNIT. The PCT/AVG column
shows the average unit completion time of the students on
file (column UNIT/AVG) as a percentage of the planned unit
time (column PLAN TIME).


For Course #70339, Unit 1, the actual average unit
completion time is 8.1 hours, which is 87% of the planned
unit time of 9.2. This unit appears efficient because the
students are completing faster than planned.
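The PCT/AVG relationship described above is a simple ratio. A minimal sketch follows, with a hypothetical function name and an assumed rounding rule. The published figures were presumably computed from unrounded times, so recomputing from the rounded columns of a report can differ by a point.

```python
def pct_avg(unit_avg_hours: float, plan_time_hours: float) -> int:
    """Average completion time as a percentage of planned unit time."""
    if plan_time_hours <= 0:
        raise ValueError("planned time must be positive")
    return round(100 * unit_avg_hours / plan_time_hours)


# A 16.8-hour average against a 16.0-hour plan: students run slightly long.
print(pct_avg(16.8, 16.0))  # 105
```

A value under 100 means students finish faster than planned; a value over 100 means they take longer.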


The third report shows Course Quality - Acceptance in the
form of responses to the Student Opinion Questionnaire (SOQ)
(Figure 8 - Top). The RATING column shows the arithmetic
mean of the responses to an overall opinion question, on a
scale of five ranging from Very Good (index of 5) to Very
Poor (index of 1), plus a null response of No Opinion. If
all students rated a unit as Average, the RATING would be
3.00.
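The RATING calculation can be sketched as a mean over the scored responses. In the sketch below, a No Opinion response is represented by `None` and left out of the mean; that treatment of the null response, like the function name, is an assumption, since the text does not say how No Opinion enters the calculation.

```python
def soq_rating(responses):
    """Arithmetic mean of 1-5 opinion responses, to two decimals.

    None stands for "No Opinion" and is excluded (an assumption).
    Returns None when nobody gave a scored response.
    """
    scored = [r for r in responses if r is not None]
    if not scored:
        return None
    return round(sum(scored) / len(scored), 2)


# Every one of 50 students answers "Average" (index of 3).
print(soq_rating([3] * 50))  # 3.0
```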

IQRP Detail Reports: The preceding TIME/UNIT report showed
"summary" data. It would normally be used to get a general
idea of course efficiency. The data that makes up the
summary report can be shown on a "detail" report (Figure 8 -
Bottom). The example shows TIME detail that would normally
be used by the course author when investigating a problem, or
by research personnel. Each of the 50 students on file is
shown as an individual report line. Shown for each student
are: study time on terminal, study time off terminal, and
time for the first and last unit test tries.

Similar "detail" data reports are available showing
results of each test item (effectiveness) and responses to
the detailed opinion questions (acceptance).
EXAMPLE OF AUTHOR USE


A new course has been developed and the author is
interested in its resultant quality.

1: The author obtains the IQRP summary reports for
TEST/RESULT (effectiveness), TIME/UNIT (efficiency), and
SOQ/RATE (acceptance) via a terminal. In this example only
the SOQ/RATE report will be used. It shows low acceptance
for Unit #2 (2.64). The report indicates that Units #1 and
#3 are much more acceptable than Unit #2 (Figure 9).

2: In order to investigate the low acceptance of Unit
#2, the author obtains an IQRP detail report of student
responses to the detail questions on the Student Opinion
Questionnaire. Of the fifteen Primary reasons why students
might not like Unit #2, Reason #4 has the most responses.
The detail report shows:

•  Of the 17 students who said the unit was Poor or Very
Poor, 16 indicated Reason #4 ("...study activities are
poor...") as a cause. They did not indicate any of the
other fourteen Primary reasons, such as "poor tests" or
"poor directions", as major causes of their low
acceptance.

•  The 16 students who said "study activities are poor"
were administered a screen with the following Secondary
reasons:

-  S22 Displayed study material not helpful in
learning.

-  S23 Referenced study material not helpful in
learning.

-  S24 Too many questions - problems - exercises.

-  S25 Too few questions - problems - exercises.

The report shows that most (14) of the dissatisfied
students picked secondary Reason #S23 as the cause. The
author now suspects that there is a problem with the
"referenced study material" in Unit #2.

•  Unit #2 is several hours long, has several objectives,
and has many different manuals that are referenced for
study activity. Which manuals are the students
complaining about?
report via a terminal. This report shows accumulated
student responses to EACH SCREEN, for EACH OBJECTIVE in Unit
#2 (Figure 9 - Bottom). The data for Objectives #1 thru #3
was examined and looks OK. However, the data for Objective
#4 (shown in the example report) identifies a problem. The
column headings CA and WA stand for "Correct Answer" and
"Wrong Answer". This objective has fourteen display
screens. Notice that at Screen (LABEL) #61a411 and #61a412
most students are responding with wrong answers rather than
correct answers (61a412: CA's = 31 and WA's = 52). Further
investigation on this objective solved the low acceptance
problem. The manual reference for this objective was
incorrect. The students were told to study one subject but
were tested on another.
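The screen-level check the author performed by eye can be expressed as a small filter over the CA and WA counts. The sketch below is illustrative only: the function name and the second screen label are hypothetical, and only the 31 and 52 counts come from the example in the text.

```python
def screens_with_more_wrong(screen_counts):
    """Return labels of screens where wrong answers outnumber correct ones.

    screen_counts maps a screen label to a (CA, WA) pair.
    """
    return [label for label, (ca, wa) in screen_counts.items() if wa > ca]


counts = {
    "61a412": (31, 52),          # counts quoted in the text
    "hypothetical-ok": (60, 10),  # made-up screen for contrast
}
print(screens_with_more_wrong(counts))  # ['61a412']
```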

Summary: The author used reports at three levels of detail
to identify a "low student acceptance" problem and pinpoint
its cause. The reports were available immediately, on-line,
via a CAI terminal.
HEADQUARTERS TRACKING

Headquarters began using the Course Quality Measurement
system to track "established" units in January 1977. Units
that have had more than fifty student completions are
considered "established". Here are the results of the first
two quarters (1Q77 & 2Q77) of tracking:

•  Number of courses currently being tracked = 33.

•  Number of units currently being tracked = 104.

•  Effectiveness:

-  TEST/RESULT criteria currently set at 75%.

-  Units meeting criteria = 62 of 104.

-  Best unit = 100%. Worst unit = 22%.

-  % of units in criteria 1st quarter = 54%.

-  % of units in criteria 2nd quarter = 60%.

•  Efficiency:

-  TIME/UNIT criteria currently set at 75% to 110%.

-  Units meeting criteria = 51 of 104.

-  Best unit = 100%. Worst unit = 196%.

-  % of units in criteria 1st quarter = 43%.

-  % of units in criteria 2nd quarter = 49%.

•  Acceptance:

-  SOQ/RATE criteria currently set at 3.25.

-  Units meeting criteria = 44 of 104.

-  Best unit = 4.22. Worst unit = 2.42.

-  % of units in criteria 1st quarter = 40%.

-  % of units in criteria 2nd quarter = 42%.
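The three tracking criteria and the "% of units in criteria" figures can be sketched as below. The function names are hypothetical, and treating the limits as inclusive is an assumption. The percentage helper reproduces the second-quarter figures from the unit counts listed above, which suggests (but does not prove) that those counts are second-quarter counts.

```python
def meets_effectiveness(sat_adj: float) -> bool:
    return sat_adj >= 75            # TEST/RESULT criterion


def meets_efficiency(pct_avg: float) -> bool:
    return 75 <= pct_avg <= 110     # TIME/UNIT criterion


def meets_acceptance(rating: float) -> bool:
    return rating >= 3.25           # SOQ/RATE criterion


def pct_in_criteria(units_meeting: int, units_tracked: int) -> int:
    """Share of tracked units that meet a criterion, as a whole percent."""
    return round(100 * units_meeting / units_tracked)


print(pct_in_criteria(62, 104))  # 60
```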




EXAMPLE RESEARCH

The Course Quality measurement system provides timely
answers to research type questions. Here are a few of the
many questions answered thus far.




PLANNED UNIT TIME VS. ACCEPTANCE

FIS II has a course development guideline that recommends
that the author design each unit to require eight hours, or
less, learning time. Question: What is the relationship of
student acceptance of units eight hours or less, and of
units over eight hours? The two IQRP reports show that
student acceptance appears better on units that are within
the eight hour guideline (Figure 10 - Top):

•  Planned unit time (P/UT) not greater than (NG) 8.0 hours
results in acceptance (SOQ RATING) of 3.31.

•  Planned unit time (P/UT) greater than (GT) 8.0 hours
results in acceptance (SOQ RATING) of 3.21.


SLOW/FAST STUDENTS VS. ACCEPTANCE

Question: What is the relationship of student acceptance
to how fast/slow the student learns? As the three reports
show, it appears that students who complete quickly have a
more favorable opinion of the unit (Figure 10 - Bottom):

•  Student completion time less than (LT) 75% of planned
time (PCT/AVG) results in acceptance (SOQ RATING) of
3.40.

•  Student completion time within limits (WL) of 75% and
125% of planned time (PCT/AVG) results in acceptance
(SOQ RATING) of 3.2a.

•  Student completion time greater than (GT) 125% of
planned time (PCT/AVG) results in acceptance (SOQ
RATING) of 3.08.
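The three inquiries above amount to splitting student records by completion pace and averaging the SOQ rating within each group. The sketch below shows that grouping with made-up records; the function names, the group labels, and the choice to treat the 75% and 125% limits as inclusive are all assumptions.

```python
from statistics import mean


def pace_group(pct_of_planned: float) -> str:
    """Classify a completion time given as a percent of planned time."""
    if pct_of_planned < 75:
        return "LT 75"
    if pct_of_planned > 125:
        return "GT 125"
    return "WL 75-125"


def mean_rating_by_group(records):
    """records: iterable of (pct_of_planned, soq_rating) pairs."""
    groups = {}
    for pct, rating in records:
        groups.setdefault(pace_group(pct), []).append(rating)
    return {name: mean(ratings) for name, ratings in groups.items()}


# Hypothetical student records, not taken from the reports.
sample = [(60, 4.0), (70, 3.0), (100, 3.0), (130, 2.0)]
print(mean_rating_by_group(sample))
```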
EXAMPLE OF ON-LINE RESEARCH

One of the course development departments produced an FIS
II course (70555) which used an experimental adjunct
publication.

•  The normal adjunct publication for Course #70555 would
contain "outline" information only and consist of about
30 pages.

•  The experimental adjunct publication contained printed
duplicates of all the teaching material presented on the
display screens and consisted of about 300 pages.

•  Students who complete Course #70555 go to an Education
Center for a conventional lecture - laboratory class.
Instructors who teach the classes said the incoming
students tell them:


"...most students study in the experimental
publication instead of using the display screens..."

"...students like the publication better than the
display screens..."

"...students using the publication learn faster..."


Course #70555 has 8 units and, at 50 student completion
records per unit, yields a total of 400 student-unit
completions for analysis. Two IQRP inquiries were used to
separate the 400 records into three groups (Figure 11):

•  Students who did most of their studying on the terminal
and made little use of the experimental publication.
These students had off-terminal study time of less than
1 hour (T/OFF LT 1.0). See 1st and 3rd example reports.

•  Students who did most of their studying off the terminal
and relied primarily on the experimental publication.
These students had off-terminal study time of more than
2 hours (T/OFF GT 2.0). See 2nd and 4th example
reports.

•  Students who used both the terminal and the experimental
publication. These students had an off-terminal study
time of between 1 and 2 hours and were not included in
the analysis.
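The three-way split on off-terminal study time can be sketched as follows. The group names and sample hours are hypothetical; only the 1.0 and 2.0 hour cut-offs (T/OFF LT 1.0 and T/OFF GT 2.0) come from the text.

```python
from collections import Counter


def study_group(off_terminal_hours: float) -> str:
    """Assign a student-unit record to one of the three groups."""
    if off_terminal_hours < 1.0:      # T/OFF LT 1.0
        return "terminal"
    if off_terminal_hours > 2.0:      # T/OFF GT 2.0
        return "publication"
    return "both"                     # excluded from the analysis


# Hypothetical off-terminal hours for a handful of records.
hours = [0.2, 0.5, 0.9, 1.5, 2.5]
print(Counter(study_group(h) for h in hours))
```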


Now let's compare the report data to the three aforesaid
"comments":

•  "...most students study in the experimental
publication...".

-  Look at the RECORD COUNT for reports 1 and 2.

-  Most students (334) used the terminal. Few used the
experimental publication (34) and few used both (400
- 334 - 34 = 32).

-  This comment did not represent the actual situation.

•  "...students like the publication better...".

-  Look at the RATING on reports 1 and 2. There is no
significant difference in student acceptance (3.09
vs. 3.06) between the two groups.

-  This comment did not represent the actual situation.

•  "...students using the experimental publication learn
faster...".

-  Look at the PCT/AVG on reports 3 and 4 (84% vs.
206%).

-  Students using the terminal complete in 84% of the
planned time.

-  Students using the experimental publication complete
much more slowly and require 206% of the planned
time.

-  This comment did not represent the actual situation.

The three comments resulted from isolated "subjective"
opinions. The Course Quality measurement system provided
the "objective" analysis that showed the comments to be
invalid.
MEDIA RESEARCH IN PROCESS

The Course Quality measurement system is being used as
the measurement vehicle in a two part "media study" now in
process attempting to determine:

•  Applicability of "media enhancement"

•  "Student preference" for types of "complex" media

Part 1: Question: Can a typical "non complex media" unit be
enhanced (quality improved) by application of the most
appropriate complex media? An existing typical course
consisting of 7 units is being used to execute the study.
Unit #6 of the course was originally designed to use:

•  Display terminal.

•  Reference to product service manuals.

•  Normal adjunct course publication (course outline).

A review by media experts identified the potential for
converting part of the display text and part of the text
referenced in manuals to a filmstrip and audio cassette
presentation. This part of the study will use the original
material for a control group and material "enhanced" with
the filmstrip/cassette for the experimental group. Both
groups will use the same objectives and test items. The
null hypothesis is: "...there will be no difference in
Course Quality between the groups...".

Part 2: Question: What type of complex media do students
prefer? Unit #3 was originally designed with a filmstrip and
audio cassette presentation. A review by media experts
indicated that the presentation could be converted to a
video presentation and to a minipub* without gain or loss of
educational effectiveness. The idea was to determine
student preference of various types of complex media.

*Minipub = Pocket novel size publication containing
information from the storyboard used in audio/visual
presentations.




Three experimental groups will be used. The groups use
the same objectives and test items.

•  Group 1: Filmstrip and audio cassette

•  Group 2: Video cassette

•  Group 3: Minipub

The null hypothesis is: "...there will be no difference in
student acceptance between the groups...". The two other
Course Quality components, effectiveness and efficiency, will
also be measured to ensure that "student preference" does
not affect overall quality.
Introduction


The Double Horseshoe Method (DHM) was devised to implement the
principles of performance-oriented language instruction. The language
instruction was required for the improvement of spoken English by Korean
soldiers who augment the U.S. Army in Korea. [These soldiers are called
KATUSA's, for Korean Augmentation To the U.S. Army. They number
approximately 7000 and provide a vital link between U.S. forces and those of
the Republic of Korea.] The Double Horseshoe Method is currently employed
in all classes at the in-processing center for KATUSA's and in units of
the U.S. Second Division.

This paper will describe the nature of the DHM and the way it achieved
the goals of performance oriented training. Although the original purpose
of DHM was to improve the spoken English ability of soldiers for whom
English is a second language, the method could be applicable to a wide
range of training topics. The description of the training problem and
how the Double Horseshoe Method coped with it will illustrate the
potentialities and limits of the method.

The  Training  Problem 

The general problem that was approached in Korea was (and is) a
long-standing one: how to teach the KATUSA's spoken English so that they are
able to learn military skills in the American Army and be productive
soldiers. The constraints on the training situation were severe and allowed
only a modest dent in this problem. At the KATUSA Processing Center
(KPC), where the developmental work on the DHM was done, Korean soldiers
fresh from basic training in the ROK army undergo three weeks of
preparation for service in the American army. The median size of a cycle fill
has been 240 men. English conversation training is carried out in
eighteen two-hour blocks. Normally, six GI instructors are available for
each cycle. Their skill in language instruction has been acquired through
on-the-job training. They are not bilingual.

The students are selected by the ROK army. Serving the three year
military obligation as a KATUSA is considered much more desirable than
doing it as a ROK soldier. The precise basis of the selection process
is unknown, other than that these men are volunteers and are supposed to have
demonstrated the ability to read and write English. According to the
Korean school system, middle school graduates receive two years of English
instruction and high school graduates as much as an additional four years.
A typical class at KPC consists of one-fourth middle school, one-half
high school, and one-fourth college graduates. In essence, the training
task was to use the eighteen two-hour blocks to orient the students to
hearing and making English sounds. Perhaps nowhere was performance
oriented training more in order. For nearly all of the students it was the
first dialogue with a native speaker.


The  Double  Horseshoe  Method 

Doing performance oriented training in classes of 25-45 students
was the major training management problem. Previously, the normal
complement of six instructors used the lecture method, calling on students
from time to time to respond individually or in chorus. Each student
had a book which contained a formidable amount of material for which he
was held accountable in an end-of-course test.

Observations and independent end-of-course tests by the Army Research
Institute-Far East Field Unit (ARI-FE) made it clear that the current
training was not effective in improving the ability of the students in
spoken English. The principles of FM 21-6, which concerns performance
oriented training, were seen as highly applicable; applying them under
the circumstances resulted in the development of the DHM.

The DHM took its name from the physical configuration of students in
the classroom. They were arranged in two horseshoe-shaped rows with the
instructor at the open end. Each student in the inner ring was
permanently paired with a student in the outer ring. Several purposes were
accomplished by this configuration. One was that the instructor's model
behavior was close to the students. His demonstration of the target
skills was easily seen and heard. A second was that during the
performance by the students of the training material, the instructor attended
only to the inner ring. The instructor was thus able to ensure that
every student performed the desired behavior every time it was required.

A third purpose was the demonstration and encouragement of peer coaching.
As each student in the inner ring was called on, his partner in the outer
ring was ready to help, usually at the direction of the instructor.
During the initial class sessions, the instructor demonstrated what and when
to coach. After the instructional material was covered with the inner
ring of students, the inner and outer rings exchanged seats and the
process was repeated. In addition to obtaining the effects of repetition,
the procedure contributed to the building of the working relationships
between members of the student pairs. This relationship was viewed as
essential to promoting practice of the training tasks outside of classroom
hours. A further device used to strengthen the peer relationship was the
praise or punishment by the instructor of the collective pair.

Scheduling

After  the  first  training  session,  during  which  the  organization  of 
the  class  occupied  the  first  hour,  a  typical  two-hour  block  of  training 
consisted  of  the  following: 

1.  Test students on material from the previous session, 40 - 50
minutes.

2.  Break, 10 minutes.

3.  Train on new material, 60 - 70 minutes.




Keeping up this schedule was the largest problem encountered in the
early stages of DHM development. This was handled in two ways: 1) the
amount of material to be covered for each session was cut down to a
comfortable time-fit; 2) instructors were told not to spend excessive
amounts of class time on individual students.

The importance of adhering to the pattern of the schedule came from
several considerations. One was that of ensuring that every man would
get his chance to perform every behavior under the instructor's
supervision. Another was that by separating the training period from the
testing period, the illusory gains in ability due purely to immediate memory
were eliminated. The time gap also gave peer instruction a chance to
occur outside of class. Finally, by requiring a test period after each
training period, the instructor was able to gain a quick gauge of the
progress (or lack of it) by his students. This allowed the instructor to
discover problems almost immediately and to correct them as they became
known rather than be surprised at the end of the course.

Results  And  Discussion 


The major question with any training method is how well does it work.
As applied to the training of spoken English to KATUSA's, the DHM was a
highly successful adaptation of the principles of performance oriented
training. Students performed the required training tasks up to the
training standards. No experiment was carried out, nor did one seem
necessary, to show what was readily observable. There was, however, one small
worry. Was there anything lost in switching from conventional instruction
to the DHM? The reason for the concern was the large reduction in volume
of material that was necessary in changing to performance oriented
training. Possibly the better students had been increasing their vocabularies
even if they weren't improving their abilities in spoken English.

To determine if this was the case, the pre- and posttraining scores
on a written test called the Five Minute English Word Test (5EVTT) were
examined during the conversion to DHM. The 5EVTT required that a student
write down as many English words plus their Korean translations as he
could in five minutes. Table 1 shows the average gain scores for
four cycles at the KATUSA Processing Center. Cycle 1 was the first stage
in the implementation process and only three of the six instructors were
attempting to use DHM. By Cycle 4, all of the instructors were using DHM.
That the average gain scores progress steadily as the DHM was implemented
is not offered as evidence that the DHM achieves the training goals that
were established. These goals were achieved daily in the classroom.
Instead, the results in Table 1 show that in spite of reduced and delimited
material, no losses were experienced on a measure which would appear to
favor the conventional method of instruction previously practiced. The
steady increase of gain scores on a written test like 5EVTT was a bonus
which could have been due to more inspirational instructors and better
motivated students as well as the Double Horseshoe Method of training.
Table 1


RESULTS OF FIVE MINUTE ENGLISH WORD
TEST, AVERAGE SCORES BY CYCLE


Cycle     N     Before      After       Gain    % Gain
                Training    Training

7701     241     30.5        36.4        5.9     19.3
7702     244     24.8        31.6        6.8     27.4
7703     200     27.6        36.7        9.1     33.0
7704      99     15.5        25.5       10.0     64.5
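The "% Gain" column of Table 1 can be checked by expressing the gain as a percentage of the before-training score. The sketch below does that, reading the table entries as the before and after averages for cycles 7701 through 7704; the function name is hypothetical.

```python
def pct_gain(before: float, after: float) -> float:
    """Gain as a percentage of the before-training average, one decimal."""
    return round(100 * (after - before) / before, 1)


# (before, after) averages per cycle, as read from Table 1.
cycles = {"7701": (30.5, 36.4), "7702": (24.8, 31.6),
          "7703": (27.6, 36.7), "7704": (15.5, 25.5)}
for cycle, (before, after) in cycles.items():
    print(cycle, pct_gain(before, after))
```

Note that the largest percentage gain (cycle 7704) also has the lowest starting score, so part of the trend in this column reflects a smaller denominator rather than a larger raw gain.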

Conclusions 

The  purpose  of  this  paper  has  been  to  describe  a  way  of  managing 
performance  oriented  training  in  large  classes.  The  Double  Horseshoe 
Method  was  developed  and  implemented  for  the  purpose  of  improving  the 
spoken  English  ability  of  Korean  soldiers  who  augment  the  US  Army. 

When used for this purpose, the DHM was effective on several counts:

1)  the G.I. instructors were able to carry it out;

2)  student performance on all the material was monitored session
by session;

3)  peer coaching was demonstrated in the classroom;

4)  scores on a measure thought to favor conventional instruction
went up during the implementation period;

5)  costs were held to the same level.

Using the DHM for subject matter other than spoken English seems
quite feasible. The considerations behind its adoption in a particular
situation are likely to be similar to those encountered at the KATUSA
Processing Center, i.e., material which lends itself to performance
oriented training, an obvious need for performance oriented training,
20 to 40 students per instructor, and perceived benefits from peer
coaching. If these conditions are present, the Double Horseshoe
Method offers a viable way of managing the delivery of performance
oriented training.
POSSIBLE STRATEGIES FOR ESTABLISHING
TRAINING  PRIORITIES 


Arthur  C.  F.  Gilbert,  Ph.D. 
Reynold O. Waldkoetter, Ed.D.


U.S.  Army  Research  Institute  for  the  Behavioral  and  Social  Sciences 

Alexandria,  Virginia  22333 


A paper by Waldkoetter et al. (1976) reported the utility of the four
factor training priority model presented by Mead (1975) in an Army
setting. In that paper the efficiency of the four factor model was
demonstrated by a multiple correlation of .88 between the four factors
and type of training evaluated along a five point scale. The five
points on the scale corresponded to five types of training, where resident
school training carried a value of "5", formal unit training had a value
of "4", non-resident courses were assigned a value of "3", on-the-job
training had a value of "2", and a value of "1" was assigned in those
instances where no training was selected for a task.

The early analysis reported by Waldkoetter et al. was based on the
assumption that degree of formalization of training could be expressed
as a continuous variable extending from a high degree of formalization
represented by resident-school training to lack of formalization
represented by no training being required. Obviously the two extreme
ends of the scale pose no difficulty in terms of definition; however,
the ordering of the three intermediate types of training requires
certain assumptions about the degree of formalization.

The purpose of this paper is to present possible alternate strategies
that might be used in the data analyses that are not based on any
assumptions about the extent to which the different types of training
are formal in nature.


1 

The views expressed in this paper are those of the authors and do not
necessarily reflect the views of the Army Research Institute or those of the
Department of the Army.
PROCEDURE

Waldkoetter et al described the data collection procedure used in
obtaining the task analysis data on MOS 76V Equipment Storage Specialist
tasks. A task list of 183 items was administered to 80 non-commissioned
officers, and the subjects were required to rate each task on three
training scales: Task Learning Difficulty (TLD),
Task Delay Tolerance (TDT), and Consequences of Inadequate Performance
(CIP). Each scale was a seven-step scale. For the Task Learning
Difficulty Scale and for the Consequences of Inadequate Performance
Scale, the high end of the scale, "7", corresponded to a high degree of
learning difficulty or indicated that the consequences of inadequate
performance were great. For the Task Delay Tolerance Scale, a value of "1"
indicated that the task should be performed immediately once the need
for its performance had been perceived by the incumbent.


Subjects were asked to indicate the type of training that they
considered appropriate for each task by indicating whether the task
should be taught in resident school, formal unit training, non-resident
school training, on-the-job training, or if it did not require any
training. For the purpose of analysis in the research by Waldkoetter
et al, these five type of training categories were assigned values of
"5", "4", "3", "2", and "1" respectively.

As in previous research, mean values were derived for each task on
the Task Learning Difficulty Scale, the Consequences of Inadequate
Performance Scale, and the Task Delay Tolerance Scale. However, for
the analyses reported here the Task Delay Tolerance Scale was reversed
in value so that a value of "7" denoted a short delay tolerance. The
same data on the percent of members performing each task in the MOS
were also used as the fourth factor. Instead of obtaining a mean value
on the type of training scale as in the previous research, the frequency
of subjects who placed the task in each of the five type of training
categories was tabulated for each task.
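
The scoring steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample ratings are hypothetical and are not taken from the study, and the sketch assumes that ratings are coded as integers from 1 to 7.

```python
from collections import Counter

# The five type of training categories named in the text.
CATEGORIES = ["resident school", "formal unit", "non-resident school",
              "on-the-job", "no training"]


def reverse_seven_point(rating):
    """Reverse a 1-7 rating so that 1 becomes 7 and 7 becomes 1."""
    if not 1 <= rating <= 7:
        raise ValueError("rating must lie between 1 and 7")
    return 8 - rating


def category_frequencies(choices):
    """Count how many raters placed one task in each training category."""
    counts = Counter(choices)
    return [counts.get(category, 0) for category in CATEGORIES]


# Hypothetical Task Delay Tolerance ratings from five raters for one task.
tdt_ratings = [1, 2, 2, 3, 1]
reversed_mean = sum(reverse_seven_point(r) for r in tdt_ratings) / len(tdt_ratings)

# Hypothetical type of training choices from the same five raters.
choices = ["resident school", "on-the-job", "resident school",
           "formal unit", "resident school"]
frequencies = category_frequencies(choices)
```

After reversal, a high mean indicates a short delay tolerance, so all three rating scales point in the same direction.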


RESULTS  AND  DISCUSSION 

In  Table  1  the  correlations  among  all  variables,  both  predictors 
and  criteria,  are  shown.  Also,  in  this  table  the  correlations  between 
the  five  point  type  of  training  scale  (Waldkoetter  et  al,  1976)  and  all 
of  the  other  variables  are  shown. 

The initial data analysis consisted of deriving the canonical
correlations between the set of predictor variables (i.e., Task Learning
Difficulty, Task Delay Tolerance, Consequences of Inadequate Performance,
and Percent of Members Performing) and the set of type of training
criterion variables (i.e., Resident School Training, Formal Unit Training,
Non-Resident School Training, On-the-Job Training, and No Training).

Three sets of canonical variates were derived, yielding canonical
correlations of .91, .49, and .86, all of which were significant beyond
the .01 level.
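
For readers who wish to reproduce this kind of analysis, the following is a minimal sketch of one standard way to obtain canonical correlations with NumPy. It is not the authors' procedure, which the paper does not describe; the data are simulated and the function names are hypothetical. The sketch builds orthonormal bases for the two mean-centered variable sets by singular value decomposition, drops redundant directions, and takes the singular values of the product of the bases.

```python
import numpy as np


def column_space_basis(matrix, tol=1e-10):
    """Orthonormal basis for the column space of the mean-centered matrix."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Keep only directions with non-negligible variance (handles redundancy).
    return u[:, s > tol * s.max()]


def canonical_correlations(predictors, criteria):
    """Canonical correlations between two variable sets (rows are tasks)."""
    qx = column_space_basis(predictors)
    qy = column_space_basis(criteria)
    # Singular values of qx' qy are the cosines of the principal angles
    # between the two column spaces, i.e. the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Simulated data: 50 tasks, 2 predictors, 2 criteria.
rng = np.random.default_rng(0)
predictors = rng.normal(size=(50, 2))
criteria = np.column_stack([
    2.0 * predictors[:, 0] + 1.0,   # exact linear function of a predictor
    rng.normal(size=50),            # unrelated noise
])
correlations = canonical_correlations(predictors, criteria)
```

The rank check matters when criterion variables are linearly dependent, as frequency counts are when every rater places every task in exactly one category.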
CORRELATIONS AMONG THE TYPE OF TRAINING VARIABLES
AND THE FOUR PREDICTOR VARIABLES
The first set of canonical variates indicated that tasks judged
high in Task Learning Difficulty tended to be perceived as requiring
resident school training. The second set of variates revealed that
tasks that were judged as being difficult to learn and which a large
percentage of members perform tended to be associated with resident
school training or on-the-job training. There was some tendency to
view tasks in which the Consequences of Inadequate Performance were
considered high as not being appropriate for non-resident school training.
The third set of canonical variates tended to indicate that for those
tasks which a low percent of incumbents perform and for which the task
delay tolerance is not important, neither on-the-job training nor resident
school training is perceived as being appropriate.

A factor analysis was performed for the four predictor variables.
Two factors emerged in this analysis. The first factor was identified
by Task Delay Tolerance and Consequences of Inadequate Performance, while
the second factor was bipolar. At one end it was identified by Task
Learning Difficulty and at the other by Percent of Members Performing.

The second factor analysis involved the set of criterion variables
(or type of training variables). Two factors were also extracted from
this set. Both of these factors were bipolar. The first factor was
identified at one end by resident school training and at the other by
on-the-job training. Both the non-resident school training and the no training
variables loaded substantially (i.e., above an absolute value of .40) on
this first factor. The second bipolar factor was identified at the
positive end by formal unit training and at the other end by no training
being required and by non-resident school training.

Five regression analyses were performed using each of the five type
of training variables as the criterion. In each analysis all four
predictor variables were used. The beta weights derived in each of
those analyses and the corresponding multiple correlations and squared
multiple correlations are shown in Table 2. For the sake of comparison
the same data are shown for the five point type of training scale
reported by Waldkoetter et al. Four of the multiple correlations were
significant at the .01 level, while the fifth, predicting formal unit
training, was significant at the .05 level. The multiple correlations
ranged from .8538 for resident school training to .3367 for formal unit
training.

The final analysis of the data involved applying the regression
weights from the five regression analyses and the weights derived from
the Waldkoetter et al analysis to obtain a predicted type of training
for each task. The correlations among the six predicted types of
training were then computed across the 183 tasks. The matrix of
correlations resulting from these computations is shown in Table 3.
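
The regression and prediction steps can be illustrated with a short NumPy sketch. It is a hedged example under stated assumptions, not the authors' computation: the data are simulated, only two criteria are shown instead of six, and the function name is hypothetical. Each criterion is regressed on the standardized predictors, the weights are applied to obtain a predicted value for every task, and the predicted values are then correlated across tasks.

```python
import numpy as np


def fit_beta_weights(predictors, criterion):
    """Least-squares beta weights, predictions, and multiple R (standardized)."""
    zx = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0, ddof=1)
    zy = (criterion - criterion.mean()) / criterion.std(ddof=1)
    beta, *_ = np.linalg.lstsq(zx, zy, rcond=None)
    predicted = zx @ beta
    # Multiple R is the correlation between predicted and observed values.
    multiple_r = np.corrcoef(predicted, zy)[0, 1]
    return beta, predicted, multiple_r


# Simulated data: 183 tasks, four predictors, two criteria.
rng = np.random.default_rng(1)
x = rng.normal(size=(183, 4))
criterion_a = 2.0 * x[:, 0] - x[:, 1]            # exact linear combination
criterion_b = x[:, 0] + rng.normal(size=183)     # noisy criterion

beta_a, predicted_a, r_a = fit_beta_weights(x, criterion_a)
beta_b, predicted_b, r_b = fit_beta_weights(x, criterion_b)

# Correlation between the two sets of predicted values across tasks.
r_between_predictions = np.corrcoef(predicted_a, predicted_b)[0, 1]
```

A high correlation between two sets of predicted values means that the two criteria lead to much the same ordering of tasks, whichever criterion is chosen.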
The results of the canonical correlation analysis indicated the
trends that exist in the data, highlighting the relationship that
exists between difficulty in the learning of a task and its appropriateness
for resident school training. Another finding of interest from
this analysis centers on the fact that if a task is considered as being
difficult to learn and if it is being performed by a large percent of
incumbents, then it is viewed as being appropriate for either resident
school training or on-the-job training.


Factor analysis of the type of training variables indicated two factors
instead of a single factor, thus showing that ordering these variables
along a single dimension may not be entirely
valid. However, it should be pointed out that four of the five training
categories loaded highest on the first factor, the exception being
formal unit training, which defined the positive end of the second factor.


The results of the regression analyses indicated that the four factors
were most efficient in predicting resident school training. The resulting
multiple correlation was .85 compared with the multiple correlation of
.88 obtained for the five-point type of training scale. The lowest
multiple correlation was obtained in predicting formal unit training.
This could very well be due to the lack of a common definition of this
variable among the raters. Again, perhaps for the same reason, the next
lowest multiple correlation was obtained when non-resident school
training was used as the criterion.

Examination of the matrix showing the correlations among the different
predicted types of training (shown in Table 3) reveals that, generally,
the correlation between the predicted formal unit training and other
predicted types of training is lowest. The correlations in this table
support the concept that whatever strategy is employed in selecting a
criterion against which to validate the four factor model, the utility
of the model in defining training priorities is upheld, with this one
exception.
UTILIZATION OF DIFFERENTIAL PROFICIENCY LEVELS FOR
CRITERION-REFERENCED TRAINING SYSTEM ASSESSMENT


John  B.  Meredith,  Jr. 
Data-Design  Laboratories 
Norfolk,  Va.  23505 


INTRODUCTION 


In designing criterion-referenced, multiple-choice tests, one
of the most perplexing problems is the determination of the
passing score. Either passing scores are arbitrarily set at
some percent correct (Sanders, 1976; Shaycoft, 1976) or they
are determined by complex mathematical methods that incorporate
α and β classification errors (Emrick, 1971; Kriewall,
1972; Hively et al, 1973; Millman, 1972 and 1973; Roundabush,
1974; and Wilcox, 1977). An α classification error occurs
when a nonmaster is falsely deemed to be a master; conversely,
a β classification error occurs when a master is falsely
deemed to be a nonmaster (Meskauskas, 1976). In most of the
previous studies, the methods for determining passing scores
were either too simplistic, thereby resulting in large classification
errors (Reichman and Oosterhoff, 1976), or required
complex parameter estimation procedures (Wilcox and Harris,
1977).

In this study an approach for determining passing scores developed
by Nedelsky (1954) is utilized. This approach involves
the use of Subject Matter Experts (SMEs) to define the minimal
performance level for a test by rating the difficulty of each
alternative to each test item for the minimally acceptable
(just passing) examinee (Meskauskas, 1976). Thus, this procedure
establishes item content rather than examinee performance
as the basis for determining item difficulty (Smilansky
and Guerin, 1976). This approach for setting passing scores
provides one of the best estimates of the probability of classifying
examinees into correct mastery or nonmastery states
(Reichman and Oosterhoff, 1976).

When evaluating a heterogeneous group of personnel with varying
experience, the comparison of experienced personnel to inexperienced
personnel with a single passing score (proficiency
level) is not appropriate. This type of comparison would pose
a serious threat to the external validity by questioning the
generalizability of the results to the entire population (Bracht
and Glass, 1968).
To overcome this external threat to the validity of the findings,
multiple passing scores can be determined according to
various levels of personnel experience. In this study, experience
was defined at three levels based upon personnel watch
station qualifications. These three levels were:

1.  Apprentice Technician (replacement school graduate)

2.  Journeyman Technician (qualified watchstander)

3.  Master Technician (qualified watch supervisor)

From these multiple proficiency levels, Minimally Acceptable
Performance Levels (MAPLs) were determined by proficiency
level for each area of a test. The purpose of this study was
to determine the extent to which replacement and advanced
training curricula produced effectively trained technicians.

METHOD 

Nedelsky (1954) developed the technique as an "absolute standard"
for evaluation of physics students on a departmental,
multiple-choice comprehensive examination at the University
of Chicago. The technique was validated by Taylor and Reid
(1972), Bobula (1974), Smilansky and Guerin (1976), and Meredith
(1977). The use of this technique is dependent upon
the assumption that SMEs can define alternative similarity
as follows:

1.  An alternative which a minimally acceptable examinee
should recognize as incorrect is given a value of
zero (0).

2.  An alternative which a minimally acceptable examinee
should not recognize as incorrect is given a value
of two (2).

3.  An alternative which is correct is given a value of
two (2).

4.  All other alternatives are given values of one (1).

An example of the application of this technique, adapted from
Bobula (1974), is provided by the faculty member who is
teaching statistics and defines ability to recognize measures
of central tendency as a component of basic statistical competency.
He might write an item and assign the following values:

"The most appropriate measure of central tendency for the
reading performance scores of a large class is the ..."

    Value    Alternative
    (2)      1.  Mode
    (0)      2.  Variance
    (2)      3.  Mean
    (2)      4.  Median
    (0)      5.  Standard Deviation

When the SME rated this item for the "minimally acceptable"
examinee, he decided that three alternatives (Mean, Median,
and Mode) were equally viable, and all were given values of
two (2). The SME rated two alternatives (Variance and Standard
Deviation) as not meeting the minimal component of basic
statistical competency, and both were given values of zero (0).


The Alternative Similarity Index (ASI) for the ith multiple-choice
test item at a given proficiency level is defined as
follows:

$$ \mathrm{ASI}_i = \frac{2n}{\sum_{j=1}^{n} \sum_{k=1}^{m} A_{jk}} \tag{1} $$

Where:

    n       = number of SMEs,
    m       = number of alternatives, and
    A_{jk}  = value assigned to the kth alternative by the jth SME.
In the preceding statistical example, the ASI was calculated
by equation (1) as follows:

$$ \mathrm{ASI} = \frac{2(1)}{\sum_{j=1}^{1} \sum_{k=1}^{5} A_{jk}} = \frac{2}{2 + 0 + 2 + 2 + 0} = \frac{2}{6} = .33 $$
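
The same calculation can be expressed as a short Python function. This is a minimal sketch: the function name and the input layout (one list of alternative values per SME) are assumptions made for illustration.

```python
def alternative_similarity_index(ratings):
    """Compute the ASI for one item.

    ratings holds one list per SME; each list gives the value (0, 1, or 2)
    that the SME assigned to every alternative of the item.
    """
    n = len(ratings)                      # number of SMEs
    total = sum(sum(sme_values) for sme_values in ratings)
    if total == 0:
        raise ValueError("at least one alternative must have a nonzero value")
    return 2 * n / total


# One SME rating the five alternatives of the statistics item above.
asi = alternative_similarity_index([[2, 0, 2, 2, 0]])
```

With three alternatives left viable, the index is one in three, which can be read as the chance that a minimally acceptable examinee who guesses among the viable alternatives answers the item correctly.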


The MAPL for each area of specialization is determined by summing
the ASIs at a given proficiency level over the number of
items within the area. For each area of specialization on the
test, there will be three MAPLs, one at each proficiency level.
Any technician at a given proficiency level whose area raw
score exceeds the respective MAPL is deemed to be a master;
conversely, any technician at a given proficiency level whose
area raw score does not exceed the respective MAPL is deemed
to be a nonmaster. A technician may only be considered a
master/nonmaster at one proficiency level. That is, technician
watch station classifications are independent and mutually
exclusive.
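
The summation and the mastery decision can be sketched as follows. The function names and the item ASIs are hypothetical; the only rule taken from the text is that a raw score must exceed the MAPL for the technician to be deemed a master.

```python
def area_mapl(item_asis):
    """MAPL for one area at one proficiency level: the sum of the item ASIs."""
    return sum(item_asis)


def is_master(raw_score, mapl):
    """A technician is a master only if the area raw score exceeds the MAPL."""
    return raw_score > mapl


# Hypothetical ASIs for a four-item area at one proficiency level.
mapl = area_mapl([0.5, 0.25, 0.25, 1.0])

passes = is_master(3, mapl)        # raw score above the MAPL
borderline = is_master(2, mapl)    # raw score equal to the MAPL
```

Note that a raw score exactly equal to the MAPL does not exceed it, so under this rule the technician is classified as a nonmaster.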

At each proficiency level, the technician's raw scores were
transformed to standardized area scores. This transformation
was appropriate in order to meet the assumptions of an analysis
of variance design (Cochran and Cox, 1957). The transformation
was based upon a standardized area criterion score
($Z_{i,j}$) for each technician and is defined as follows:

$$ Z_{i,j} = \frac{X_{i,j} - \mathrm{MAPL}_{i,j}}{s_i} \tag{2} $$

Where:

    X_{i,j}     = raw examinee score for the ith area, jth
                  proficiency level.
    MAPL_{i,j}  = MAPL score for the ith area, jth proficiency
                  level.
    s_i         = standard deviation of raw scores for the ith
                  area.
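
A minimal Python sketch of the transformation in equation (2) follows. The raw scores and the MAPL value are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not state which form of the standard deviation was used.

```python
import statistics


def standardized_area_scores(raw_scores, mapl):
    """Express each raw area score as a deviation from the MAPL in SD units."""
    s = statistics.stdev(raw_scores)   # sample SD (an assumption)
    return [(x - mapl) / s for x in raw_scores]


# Hypothetical raw scores of three technicians in one area, with a MAPL of 5.
z_scores = standardized_area_scores([4, 6, 8], mapl=5)
```

A positive score means the technician scored above the minimally acceptable level for the area, and a negative score means below it, so scores from areas of different length become comparable.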

From  standardized  area  criterion  scores,  a  factorial  analysis 
of  variance  was  utilized  to  test  differences  among  areas  and 
proficiency  levels. 
This analytical design was a pure Model I, where both effects
(areas and proficiency levels) were fixed. This is represented
by (Sokal and Rohlf, 1969):

$$ Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk} $$

Where:

    Y_{ijk}               = kth technician representing the ith area
                            and the jth proficiency level.
    \mu                   = parametric mean of technician population.
    \alpha_i              = fixed treatment effect for ith area.
    \beta_j               = fixed treatment effect for jth proficiency
                            level.
    (\alpha\beta)_{ij}    = interaction effect between ith area and
                            jth proficiency level.
    \epsilon_{ijk}        = error term of kth technician at the jth
                            proficiency level for the ith area.


The null hypothesis of interest in this study was that there
would be no significant difference among areas.

$$ H_0: \mu_a = \mu_b = \cdots = \mu_l $$

When a significant difference is found among areas, a post hoc
multiple comparison test is utilized to determine which area(s)
differ from the other areas. Any area that is significantly
below the other areas may indicate undertraining. Conversely,
any area that is significantly above the other areas may indicate
overtraining.
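
The sums of squares, degrees of freedom, and F ratios for a balanced two-way fixed-effects design can be computed directly, as the following NumPy sketch shows. It is an illustration on simulated scores, not a reanalysis of the study data; the function name is hypothetical, and the array dimensions (12 areas, 3 proficiency levels, 28 technicians per level) are chosen to match the design reported in the findings.

```python
import numpy as np


def two_way_fixed_anova(data):
    """Balanced two-way fixed-effects ANOVA.

    data[i, j, k] is the standardized score of the k-th technician
    in area i at proficiency level j.
    """
    a, b, n = data.shape
    grand = data.mean()
    area_means = data.mean(axis=(1, 2))
    level_means = data.mean(axis=(0, 2))
    cell_means = data.mean(axis=2)

    ss = {
        "areas": b * n * np.sum((area_means - grand) ** 2),
        "levels": a * n * np.sum((level_means - grand) ** 2),
        "error": np.sum((data - cell_means[:, :, None]) ** 2),
    }
    # Interaction is what remains of the between-cell variation.
    ss["interaction"] = (n * np.sum((cell_means - grand) ** 2)
                         - ss["areas"] - ss["levels"])
    df = {"areas": a - 1, "levels": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms = {key: ss[key] / df[key] for key in df}
    f = {key: ms[key] / ms["error"]
         for key in ("areas", "levels", "interaction")}
    return ss, df, f


rng = np.random.default_rng(2)
scores = rng.normal(size=(12, 3, 28))   # simulated standardized scores
ss, df, f = two_way_fixed_anova(scores)
```

In a Model I design every effect is tested against the error mean square, which is why all three F ratios share the same denominator.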

FINDINGS 

In this study, a total of 17 SMEs evaluated a 260-item test
that was composed of 12 areas of specialization (A through L).
Each SME evaluation was used at only one proficiency level:
12 SMEs were used to determine the Apprentice Technicians'
MAPL for each area, 2 SMEs were used to determine the
Journeyman Technicians' MAPL for each area, and 3 SMEs were
used to determine the Master Technicians' MAPL for each area.

This test was administered to Navy submarine technicians
(N=317). The technicians were subsequently divided into
three mutually exclusive watch station categories. From each
of the three watch station categories, 28 technicians were randomly
selected and utilized in the analysis. The number of
technicians randomly selected was based upon the power of the
analysis (1 - β), which was determined to be .95 (Cochran and
Cox, 1957).

The  area  MAPLs  by  proficiency  level  are  presented  in  Table  1. 


Table 1.  Area MAPLs by Proficiency Level

         # Items    Apprentice    Journeyman    Master
Area     in Area    Technician    Technician    Technician

A          11          4.994         5.577         6.908
B          18          8.478        11.124        11.430
C          18          9.612        10.350        11.250
D          41         21.279        23.944        27.429
E          26         13.936        13.936        16.458
F          20         10.700        10.700        13.160
G          20         11.160        11.160        13.000
H          20         11.000        11.000        11.000
I          21          9.870        11.340        13.230
J          21         11.067        11.067        12.810
K          18          9.774         9.774        10.296
L          26         16.302        16.302        17.082

NOTE:  For purposes of this study, an additional constraint
was imposed. This constraint was that neither Journeyman
nor Master Technicians' area MAPL may be lower
than the Apprentice Technician MAPL. Also, Master
Technicians' area MAPL may not be lower than the Journeyman
Technician area MAPL.
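
One simple way to impose such a constraint is to raise each level's MAPL to the highest value found at any lower level. The sketch below assumes that this is how the constraint was applied, which the paper does not state; the function name is hypothetical, and the unconstrained journeyman value in the example is invented for illustration.

```python
from itertools import accumulate


def enforce_nondecreasing(mapls):
    """Raise each MAPL to at least the highest MAPL of any lower level.

    mapls is ordered apprentice, journeyman, master.
    """
    return list(accumulate(mapls, max))


# Hypothetical unconstrained MAPLs for one area; the journeyman value
# falls below the apprentice value before the constraint is applied.
constrained = enforce_nondecreasing([13.936, 13.2, 16.458])
```

Under this assumption, areas in which two adjacent levels show identical MAPLs in Table 1 would be those where the higher level's SMEs produced a lower unconstrained value.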
The null hypothesis of no difference among areas was rejected.
The analysis of variance summary is presented in Table 2.


Table 2.  Analysis of Variance Summary

Source of                   Sum of     Degrees of   Expected               Mean                 Level of
Variation                   Squares    Freedom      Mean Square            Square    F-Value    Significance

Among Areas                  160.420        11      σ² + nb Σα²/(a-1)      14.584    16.331     .000000005
Among Watch-Stations           1.828         2      σ² + na Σβ²/(b-1)        .914     1.024     .3595
Areas X Watch-Stations        63.599        22                              2.891     3.237     .0000001
Error                        868.350       972      σ²                       .893

Total                      1,094.197     1,007


A post hoc multiple comparison test (Newman-Keuls) was performed
to determine which area(s) differed significantly from
the other areas. This was appropriate since the area effects
from the analysis of variance were highly significant. Since
a highly significant interaction between areas and watch stations
was also found, the interpretation of the area effects
becomes more complex. With a highly significant interaction
effect, global training assessments based upon area effects
alone would probably be misleading. Therefore, graphical analyses
of mean area scores are presented in Figures 1, 2, 3,
and 4 to facilitate interpretation of the results (Winer,
1971).

From the graphical analyses, the following findings are reported
by area:

1.  In area A, all technician groups were above expectations.
(This area dealt with casualty procedures
which are very thoroughly taught.)

2.  In area B, apprentice technicians performed significantly
higher than expected when compared to the
other technician groups. This may indicate the possibility
of overtraining in this area for initial
replacement training.

3.  In area C, all technician groups performed somewhat
above expectations. This gives an indication that
technician performance is satisfactory (above expectations).

4.  In area D, all technician groups performed somewhat
below expectations. However, master technicians performed
significantly below expectations. This may
indicate that refresher courses (advanced training)
may be needed for master technicians.

5.  In area E, all technician groups consistently performed
as expected.

6.  In area F, all technician groups performed adequately.
However, a large variance among technician groups
requires further investigation.

7.  In area G, all technician groups performed below the
expected performance levels. The apprentice technicians
performed at a level which may indicate a
lack of training.

8.  In area H, journeyman and master technician groups
performed as expected. The apprentice technician
group performed at a level which may indicate
undertraining.

9.  In area I, apprentice technicians performed as
expected, with the journeyman technician group performing
somewhat below expectations. The master
technician group, however, performed significantly
below expectations, which may indicate a serious
undertraining problem.

10.  In area J, all technician groups performed below
expectations, which may indicate a general undertraining
trend in this area of specialization.

11.  In area K, master technicians performed somewhat
below expectations. The apprentice and journeyman
technician groups, however, performed significantly
below expectations, which may indicate a general
undertraining in this area of specialization.

12.  In area L, all technician groups performed significantly
below expectations. This may indicate serious
undertraining in this area of specialization.
CONCLUSIONS

This paper has extended a methodology for making preliminary
training assessments based upon criterion-referenced, multiple-choice
tests from the use of a single proficiency level
(Meredith, 1977) to multiple proficiency levels. The extension
to multiple passing scores has resulted in a more penetrating
assessment of training because technicians of
varying experience were evaluated at their experience level.
To compare experienced technicians to a criterion based upon
inexperienced technicians or vice versa may be inappropriate.
With the use of multiple proficiency levels, evaluators are
able to fine tune a training system to the specific needs of
various levels of technicians.


The limitation of the multiple MAPL procedure is that an additional
number of SMEs is required for evaluating the items in
multiple-choice tests.


One solution to this problem is to have each SME evaluate
items for more than one technician proficiency level. In
this study, this was determined to be inappropriate because
SMEs would carry over judgments from one proficiency level to
another, which would violate the additional assumption of local
independence among SME evaluations that such a design imposes.


The educational implications of this study may be in the determination
of minimal performance criteria (passing scores) for
a test or test part. This may be extremely useful for those
of us involved with competency-based education. In particular,
educators could evaluate the effectiveness of their
system at various points in the curriculum while at the same
time evaluating the product (students). In this manner,
educators could receive quantitative data from which adjustments
could be made in the relative emphasis of their programs.
This could result in a better attuned system that will
meet the needs of the students and will result in a more efficient
allocation of limited resources.
Relating Task Surveys to
the Content of Existing Training Programs


Harry  L.  Ammerman,  Ph.D. 

The  Center  for  Vocational  Education 
Columbus,  Ohio 


During the many years that a number of us have conducted research
on military technical training and testing, we often realized our results
also had potential for civilian applications. Now I have the
opportunity to reverse that circumstance, by describing a methodology
developed in the public education sector that may have a usefulness in
military contexts. I hope you find it of interest and value.

It is taken for granted that most of you are aware of the recent
emphasis upon the validation of employee selection tests, as put forth
in Federal Executive Agency Guidelines. Related to this emphasis, the
1976 federal court decision in the case of Washington vs. Davis begins
to highlight a concern for the content validity of training programs,
potentially subjecting curriculum content (as well as the content of
achievement tests based on that training) to standards of validation
comparable to those standards being imposed upon employee selection
tests.

The basis for such validation of training content already exists
for many of you in your task inventories and surveys of occupational
performance. By these means, and by related methods used in the engineering
of instructional systems, there generally is produced an
identification of what work tasks are relevant to a defined occupation.
There also is some selection of which tasks are appropriate for formal
school training, as well as some specification of what task content and
performance standards are to be of concern in school training for each
occupation.

A problem can arise at this point when the results of these front-end
analyses are to be compared to the content of an existing training
program, as might be done to see if there are any significant discrepancies
between the two. This comparison is not too difficult when the
content of the training program is given in the same terms as the
occupational task survey results; that is, in terms of specific tasks
and of the knowledges, skills, and proficiencies associated with each
task.
However, an existing curriculum may happen to be stated in a somewhat
different form, making comparisons very difficult. A particular
program might state its training content in such other forms as these:

1.  Listing or outlining the topics, concepts, generalizations,
system components, or other such elements of knowledge content
to be dealt with.

2.  Descriptions of what instructors are to do, or the activities
to be carried on by the teachers.

3.  Stated as highly generalized patterns of behavior, noting the
general kinds of changes in students with which the program is
intended to deal.

4.  Student performance objectives stated in terms of school-related
behavior.

5.  Listing of the titles of available instructional courses, as
in a college catalog.

6.  Identification of particular textbooks, student workbooks,
teaching aids, laboratory exercises, and other instructional
resources to be used.

7.  Test items which convey the intended areas of ability development
or learning attainment.

When an existing curriculum is described only in one or more of
these other ways, the relation of that curriculum to job performance
content derived by task survey procedures is not readily apparent. It
would be helpful if present curriculum content were convertible to a
form more similar to occupational survey results.

Defining  Curriculum  Content 

Let  me  pause  at  this  point  to  define  what  is  meant  here  by  the 
term  "curriculum  content,"  and  to  suggest  the  key  variables  to  be  used 
in  identifying  curriculum  content. 

The concept of "curriculum" is considered to mean the "intended
learning outcomes" that have been selected and ordered. This view of
curriculum as being a product that states "what is to be learned" by
a student is based upon Mauritz Johnson's 1967 and 1969 definitive
considerations in curriculum theory, wherein he distinguishes between
the concepts of "curriculum" and "instruction."
"Instruction" becomes the process by which intended learning outcomes
are achieved. "Curriculum," on the other hand, is the planned and
structured series of those intended learning outcomes. Thus, the basic
distinction is made here between (a) what is to be learned (that is,
the curriculum) and (b) how such learning is to be attained (the
instruction).

"Curriculum content" is identified on the basis both of its intended
inclusion and its emphases in a training program. The use of content
inclusion and emphasis as the key variables of curriculum content is
based on the conclusions of Decker Walker and Jon Schaffarzick from
their extensive 1974 study of the important influences on student
learning achievement.

These features of content inclusion and content emphasis are what
we have attempted to operationalize. This has been done in an
instrument to be applied to an existing training program for the
purpose of identifying the job performance content that constitutes
the planned and intended outcomes of that training. The instrument,
patterned as a task questionnaire, identifies the intended curriculum
content in a manner that improves our ability to make direct comparison
with task survey results or other such front-end analyses resulting from
instructional system design efforts.

"Content inclusion," as that concept is operationalized here, is
concerned with whether each particular task of an occupation is or is
not intended to receive some consideration in the training program.
"Content emphasis" is concerned with the level of development of
performance ability that is intended in the school training. This
indicates DEGREE of task emphasis.

Additionally, "content emphasis" can pertain to AREA of task
emphasis, where particular non-performance features of a task are
especially important and attended to in the training process.

Curriculum  Content  Questionnaire 

These three identifiers of curriculum content form the basis for
task questions used in a Curriculum Content Questionnaire that is
administered to school personnel who are most knowledgeable about
what learning is intended in a particular training program. This
questionnaire is the means by which curriculum representatives who
are knowledgeable about planned program content may indicate the nature
and emphases of job content existing in a curriculum.



The questionnaire is similar in format to a Task Inventory
Questionnaire.


Figure 1. Format of Curriculum Content Questionnaire

As shown in Figure 1, it consists of a column listing the tasks
of an occupation, followed by two special task questions. The first
question seeks to identify both task inclusion in training and its
degree of emphasis with respect to the level of task ability. The
second question probes for task areas that are especially important
and intended to be emphasized in the training relevant to each task.

The Appendix contains complete directions and explanation of the
response categories for each task question. Here let me show
abbreviated versions of each response scale.

Level of Task Development

The first question asks each respondent to rate the extent that
the curriculum, during the training program, deliberately plans to
develop task proficiency. Eight levels of task development are
possible, with the levels being defined in a manner similar to that
used by John Hemphill (1960) in his scaling of the job significance
of tasks:


0  NO DEVELOPMENT of the task is intended
1  Develop only a GENERAL AWARENESS of the task
2
3
4  Develop a BASIC ABILITY to perform the task
5
6
7  Develop a HIGH PROFICIENCY in the performance of the task

Responses other than "0" would be used to indicate some degree of
inclusion of a task in the planned learning. This inclusion could then
range from a minimal general awareness of the task to development of
very high proficiency in performing the task. The midpoint of level "4"
depicts a basic ability to do the task, but implies no special intent
that any advanced speed, accuracy, or excellence of task performance be
developed. This is typically the most frequently used category, with
level "7" being next most frequent. Levels higher than "4" represent
more advanced levels of skill development with increasingly higher
standards of speed, accuracy, or excellence of task performance.

Table 1 portrays the percent of times raters used each response
category when the Curriculum Content Questionnaire was applied to
training programs in three different occupational areas.


Table 1

Percent of Category Usage
On Level of Development Scale

                                      Training Program
Scale Category                 Mechanic   Programmer   Secretary

0 - NO DEVELOPMENT                12%        19%          11%
1 - GENERAL AWARENESS             11          6       [illegible]
2                                  6          4            5
3                                  4          6            6
4 - BASIC ABILITY TO DO           18         21           27
5                                 11          7           17
6                                 14         10           13
7 - VERY HIGH PROFICIENCY         24         16           15
Missing Data                       1          4            0

Nine raters were used for each type of training program, with 180 tasks
rated in each occupation. This type of scale helps distinguish between
different levels of task proficiency by stretching ratings of developed
task performance over four categories, levels "4" through "7". Ratings
for each task are summarized across raters by taking the mean value of
the 0 - 7 scale. Mean values above 3.0 generally indicate that some
amount of task training is in fact planned for the curriculum. However,
no one precise mean rating was found to accurately designate the point
that differentiated between "no development" and "some development"
intended in training. This distinction appears to be identified better
by other means, using survey data from workers and supervisors on
questions of task occurrence and significance. However, the higher the
value for level of development, the more likely the intent to include
the task in training.

Interrater reliability for the nine judges over 180 tasks in each
occupation was .88 for Mechanics, .78 for Programmers, and .83 for
Secretaries. These reliabilities are adjusted for mean differences among
raters, and were calculated using Ben Winer's analysis of variance
procedure (1971).
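
The summary across raters and the reliability estimate can be sketched as follows. This is a minimal illustration, not the computation actually run for the study: the ratings are hypothetical, and the reliability formula (one minus the ratio of residual mean square to between-tasks mean square in a tasks-by-raters analysis of variance) is assumed here to be the form of the Winer procedure that removes mean differences among raters.

```python
from statistics import mean

def task_means(ratings):
    """ratings[t][r] is rater r's 0-7 rating of task t; returns one mean per task."""
    return [mean(row) for row in ratings]

def adjusted_reliability(ratings):
    """Reliability of the mean of k raters, with rater mean differences removed.

    Assumed form: 1 - MS_residual / MS_tasks from a tasks-by-raters ANOVA.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_tasks = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_resid = ss_total - ss_tasks - ss_raters
    ms_tasks = ss_tasks / (n - 1)
    ms_resid = ss_resid / ((n - 1) * (k - 1))
    return 1 - ms_resid / ms_tasks

# Hypothetical ratings: 4 tasks by 3 raters (the study used 180 tasks and 9 raters).
ratings = [
    [0, 1, 0],
    [4, 5, 4],
    [7, 7, 6],
    [2, 3, 2],
]
means = task_means(ratings)
# A mean above 3.0 only suggests that training is planned; it is not a firm cutoff.
likely_trained = [m > 3.0 for m in means]
reliability = adjusted_reliability(ratings)
```

Because rater main effects are subtracted out before the residual is formed, a rater who is uniformly more lenient than the others does not lower the estimate.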

It is of interest to note the obtained relationship between this
level of development scale and other questions often used on Task
Inventory Questionnaires. We had some task survey data available from
120 workers in each occupation. Their mean responses were correlated
against mean levels of task development, as shown in Table 2.

As can be seen in this table, task survey data for Mechanics
tended to correlate quite highly with intended levels of development.
However, for the less prescribed, less routine types of occupations,
these correlations dropped off considerably, though all retained
statistical significance.
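
A correlation of this kind, computed over tasks, can be sketched as below. The paper says only that the two sets of task means were "correlated"; the use of the Pearson product-moment coefficient is an assumption, and the data values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for four tasks: one survey measure and one curriculum measure.
percent_performing = [5, 40, 85, 20]        # percent of workers performing each task
mean_development = [0.3, 4.3, 6.7, 2.3]     # mean rating on the 0-7 scale

r = pearson_r(percent_performing, mean_development)
```

Each task contributes one pair of values, so with 180 tasks per occupation each coefficient would rest on 180 pairs.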


Table 2

Correlations Between Task Survey Data
And Level of Development Scale

                                                Training Program
Task Survey Measure                      Mechanic   Programmer   Secretary

Percent Of Workers Performing
  Each Task                                 .89        .76          .61
Frequency Of Task Performance:
  Based On All Workers Surveyed             .33        .76          .63
  Based Only On Workers Who
    Perform The Task                   [only two values legible: .32, .58]
Relative Proportion Of Time Spent           .77        .30          .41
Extent Task Is Part Of The Job
  (Hemphill scale of significance)          .91        .72          .59


Areas of Emphasis

Now, turning to the second question on the Curriculum Content
Questionnaire, we have postulated 11 task areas that might be especially
important for emphasis in task training. These areas represent matters
other than performance speed and accuracy, since such features are
already identified by the level of development question.

Four of the areas pertain to various aspects of the job content;
two pertain to personal matters, while the remaining five pertain to
technical knowledge and skill areas. Each of these is accompanied by a
brief definition in the questionnaire (see Appendix). Raters are asked
to indicate which, if any, areas other than performance ability are
especially emphasized in training for each task.

Usage  of  the  several  categories  of  emphasis  varied  among  the  three 
occupations  in  which  we  administered  the  Curriculum  Content  Questionnaire, 
though  Technical  Knowledge  (Category  9)  predominated. 

The  distribution  in  Table  3  again  reflects  nine  raters  and  180  tasks 
per  occupational  area. 


Table 3

Percent Of Category Usage
For Areas Of Training Emphasis

                                        Training Program
Area Of Emphasis                 Mechanic   Programmer   Secretary

 1 - Order, Timing                   9%        10%          12%
 2 - Value, Purpose                 1?         18            4
 3 - Safety                          8          1            2
 4 - Varied Conditions               5          1            9
 5 - Relating To Others              3          4           11
 6 - Attitude, Responsibility        9         10           17
 7 - Basic Education                 4          8            6
 8 - Detect Discrepancies           15          7            8
 9 - Technical Knowledge            18         25           24
10 - Job Aids                       11          8            5
11 - Alternate Methods               5          8            3

The figures showing 10% or more of the responses are highlighted by
circles on this table. Sensitivity to the value and importance of tasks
was emphasized more often for Mechanics and Programmers than for
Secretaries, whereas secretarial training more often emphasized the
development of pride in work done and feelings toward doing
quality work, though I hasten to add that these matters were not totally
neglected for the other two occupations. Raters of Mechanic training
programs tended to mark twice as many areas for emphasis as did raters
for the other two occupations.

These  patterns  of  actual  distributions  of  rater  use  of  the  emphasis 
categories  are  modified  somewhat  when  the  responses  are  summarized 
across  nine  raters. 

As  it  turned  out,  not  all  180  tasks  per  occupation  were  intended  for 
training,  leaving  some  for  on-the-job  experience  and  learning.  Areas  for 
each  task  were  summarized  by  noting  simply  where  four  or  more  of  the  nine 
raters  agreed  on  the  same  area  of  emphasis.  The  results  are  shown  in 
Table  4. 
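
The agreement rule just described can be sketched for a single task as follows. The rater responses are hypothetical; the threshold of four raters is the one stated above, and area numbers follow the questionnaire's categories 1 through 11.

```python
from collections import Counter

def agreed_areas(rater_marks, min_raters=4):
    """Return the emphasis areas marked by at least min_raters raters.

    rater_marks holds one set of area numbers (1-11) per rater for one task.
    """
    counts = Counter(area for marks in rater_marks for area in set(marks))
    return sorted(area for area, n in counts.items() if n >= min_raters)

# Hypothetical marks from nine raters for one task.
marks_for_task = [
    {9}, {9, 3}, {9}, {3, 8}, {9, 8}, set(), {3}, {3, 10}, {8},
]
areas = agreed_areas(marks_for_task)
```

In this example Safety (3) and Technical Knowledge (9) each reach four raters and are retained, while Detect Discrepancies (8) falls one rater short.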

In the distribution that results from looking only where such a
level of agreement existed, we find that Mechanics increased their
proportions for task Value and Safety, as well as for Detecting
Discrepancies and for Technical Knowledge. All other areas decreased
their proportions in comparison to raw frequencies of category usage.
For Programmers, areas of Worker Responsibility and Technical Knowledge
increased their proportions, with other areas decreasing. For
Secretaries the increases in proportion were evident in the areas of
Relating to Others and Technical Knowledge.

As an aside, it was interesting to note that when a comparable
question was asked of employers regarding their expectations for trained
graduates (Ammerman & Essex, 1977), the Basic Education area was
more evident for Mechanics and Secretaries. The Mechanic areas of
Safety, Detecting Discrepancies, and use of Job Aids were much less
evident in employer expectations, as compared to training intentions.
However, interpreting such differences between training intentions and
employer expectations goes beyond the scope of this paper.


Table 4

Number Of Tasks Per
Area Of Training Emphasis

                                        Training Program
Area Of Emphasis                 Mechanic   Programmer   Secretary

 1 - Order, Timing                   1          •            4
 2 - Value, Purpose                 17          3            •
 3 - Safety                         17          •            •
 4 - Varied Conditions               •          •            •
 5 - Relating To Others              1          •           10
 6 - Attitude, Responsibility        6          3            9
 7 - Basic Education                 •          •            1
 8 - Detect Discrepancies           36          1            3
 9 - Technical Knowledge            53         16           31
10 - Job Aids                       10          •            •
11 - Alternate Methods               1          1            •

Emphasis Totals                    142         24           58

Summarizing The Results To Show Training Intentions

The results from the two questions of the Curriculum Content
Questionnaire, as applied to three different types of occupational
training programs that precede employment in those jobs, are illustrated
in Tables 5 and 6.


Table 5

Results For Intended Task Development

                                           Training Program
                                    Mechanic   Programmer   Secretary

No Development Intended                52         104           51
Some Development Intended             128          76          129

Level Of Task Development:
  Other Than Ability To Perform         7           3            8
  Basic Ability To Do Task             20          28           44
  Performance Ability Plus
  Standards Of Speed, Accuracy,
  And/Or Excellence (Increasing
  Levels Of Proficiency):
    Level 5                            59          23           53
    Level 6                            42          20           21
    Level 7                             0           2            3

Of the 180 tasks listed in each survey, no development was intended
for 29% of the Mechanic tasks, 58% of the Programmer tasks, and 28% of
the Secretarial tasks. Comparatively little training of Programmer
tasks was intended prior to employment, with less than half the tasks
that were relevant to the job being selected for training. This small
number of tasks receiving preemployment training is reflected in the
correspondingly low number of training emphasis areas that were
identified. By the way, this result corresponds to information we got
from employers, who said they generally train programmers themselves,
and do not expect much training prior to employment. Of the tasks for
which some training was intended, meaningful performance standards
existed for 80% of the Mechanic tasks, but for only 60% of the tasks
trained in the other two types of programs. Training intentions can be
described in abbreviated form for each task, as illustrated briefly in
Table 6, using examples from the Secretarial occupation.
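
The percentages quoted above can be checked against the counts in Table 5 with a short calculation. The sketch assumes that "meaningful performance standards" refers to tasks rated at Levels 5 through 7; that reading is an inference from the table layout, not something the text states outright.

```python
TOTAL_TASKS = 180

# Counts taken from Table 5.
no_development = {"Mechanic": 52, "Programmer": 104, "Secretary": 51}
levels_5_to_7 = {
    "Mechanic": (59, 42, 0),
    "Programmer": (23, 20, 2),
    "Secretary": (53, 21, 3),
}

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

summary = {}
for job, untrained in no_development.items():
    trained = TOTAL_TASKS - untrained
    with_standards = sum(levels_5_to_7[job])
    summary[job] = {
        "pct_no_development": pct(untrained, TOTAL_TASKS),
        "tasks_trained": trained,
        "pct_with_standards": pct(with_standards, trained),
    }
```

On these counts the share of trained tasks carrying performance standards comes to about 79% for Mechanics, 59% for Programmers, and 60% for Secretaries, in line with the rounded figures of 80% and 60% given in the text.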


Table 6

Sample Task Training Intentions

                Intention
Level                      Emphasis              Task

No Development                                   Assemble and Staple
                                                 Duplicated Materials.

Less Than Ability          Relating To           Greet Callers Or
To Perform                 Others                Visitors.

Develop Basic Ability      Basic                 Place Telephone
To Perform (no special     Education             Calls.
standards)

Develop Ability To                               Proofread Typewritten
Perform With Advanced                            Copy.
Proficiency

Develop Ability To         Technical             Edit Letters Dictated
Perform With Advanced      Knowledge             By Employer.
Proficiency

For some of the readily learned tasks, such as the "assembling and
stapling of duplicated materials," no development was intended in
preemployment training. Some other tasks did not warrant the development
of performance ability, but the training did intend to incorporate an
emphasis upon at least one task area other than performance of the task
itself.

In the examples in this illustration, training in task performance
was intended for the last three tasks listed: the first with no special
standards of performance, but including an emphasis upon the learning
of some elementary communication skills. Some rather advanced
proficiency in task performance was intended for the last two tasks,
including an area of special emphasis for one of them.


Uses For The Curriculum Content Questionnaire


The intention of portraying the curriculum content of a training
program in this manner is to allow direct comparisons to be made with
the results of task surveys and other front-end analyses resulting from
the systems engineering and instructional design of training programs.
Whatever the form in which the present curriculum content may exist, the
Curriculum Content Questionnaire appears useful in converting that
actual content to a form more compatible with task survey data and
analyses of the job. It indicates whether some training for a task is
intended and, if so, to what general level of proficiency and with what
areas of training emphasis. It serves to identify what job capacities
get developed, though not the enablers entering into such learning and
development. These elements should permit a reasonable comparison to
be made with the results of analyses identifying performance training
needs of an occupation.

In addition to its application in this context, the Curriculum
Content Questionnaire would also appear to be of potential utility
in several other matters. For one, it could be used to develop a
composite picture of a training program where that training occurs at
different locations or through a series of instructional courses, such
as might occur in local unit training. In another instance, it might
serve as a useful means for denoting the intended skill level of trainees
on particular tasks, for use in developing samples of job performance
measures or other work sample tests, or for use in communicating the
intention of the school to operational units (or other such employers of
the graduates) so that local units might plan appropriate assignments
and subsequent on-the-job training.

While this method of relating task surveys to the content of
existing training programs may not represent a final satisfactory
solution to the problem, it is hoped that it may perhaps serve your
present needs and also stimulate the development of even more useful
methods.


Applying Occupational Survey Data in Instructional Systems Development


Hendrick  W.  Ruck 
Richard  T.  f'ineen 
Claude  C.  Cunningham 


Although it is generally agreed that occupational survey data are
extremely useful in developing training requirements, current Air Force
procedures do not specifically govern the application of such data in
instructional systems development. This paper presents a model developed
by the authors which is based on current Air Force instructional systems
development (ISD) directives. The model, therefore, is not empirically
based; rather, it follows established principles of ISD. The model is
proposed as a guideline for training decisions with the full expectation
that it will be modified as necessary.

Existing procedures for developing training requirements for Air Force
specialties normally result in Specialty Training Standards (STS) which are
not task-specific. STSs are listings of items requiring formal training
for airmen in a given specialty. Correlating occupational survey task
statements with STS items is a difficult, laborious chore which may not
always be performed due to the amount and difficulty of work involved. A
starting point for alleviating this situation would be the changing of STS
item statements to occupational survey task statements. While this model
assumes that such a change will be made, the usefulness of the model would
not be lessened should the change be rejected, since newly developed
computerized matching of STS items and occupational task statements is
expected to be available in the near future.

The basic model proposed here comprises three different decision
subsystems. The subsystems flow sequentially; however, the second and
third subsystems may be applied concurrently. First, tasks are selected for
training and placed on the specialty training standard. Second, task
skill-knowledge codes are assigned. And third, the formal, basic school
training course is derived. The most significant underlying assumption of
the model is that occupational survey data are a valid measure which may
largely determine training requirements.

In the interest of clarity, a definition is appropriate here. The
word task has been used in this paper in referring to occupational survey
data. Training experts often use the word task, and their definition of
this word may differ from that meant with respect to occupational survey
data. A task is defined here as a behavior which is time measurable, that
has a beginning and an end, and that is understood to be performed in only
one way. Performing a procedure, if the procedure is invariant, may be a
task. Trivial actions, such as inserting keys or removing specific screws,
are not usually considered to be tasks; rather, they are elements of tasks
and, therefore, are not specified in occupational survey data.


Selection of Tasks for Training


Under this model, the specialty training standard is developed by
listing all tasks performed by at least ten percent of the job incumbents
of at least one of the skill levels (the Air Force Specialty Training
Standard is designed for apprentice, journeyman, and technician training
requirements). That is, tasks performed by ten percent or more of the job
incumbents at any skill level represent a comprehensive and valid listing
of tasks applicable for training in the specialty. Indication is made on
the standard as to which tasks are applicable to the various skill levels,
since not all tasks are relevant at all skill levels.

In developing the above criterion, initially four task factors were
considered, as suggested in the Air Force literature. The factors were:
percent members performing, learning difficulty, probable consequences of
inadequate performance, and task delay tolerance. Using cutoffs derived
from the literature, the criteria were used to develop training standards
for three Air Force specialties. An analysis of the resulting training
standards suggested that the ten percent cutoff on percent members
performing was an adequate substitute for the complex four-factor criteria
suggested by the ISD literature.
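
The ten percent criterion lends itself to a simple selection routine, sketched below. The task names and percentages are hypothetical, and treating a task as "applicable" to exactly those skill levels that meet the cutoff is an assumption made for illustration; the paper does not say how applicability is marked on the standard.

```python
CUTOFF = 10.0  # percent of job incumbents performing, as stated in the model

def select_tasks(percent_performing, cutoff=CUTOFF):
    """Select tasks performed by at least `cutoff` percent at any skill level.

    percent_performing maps task -> {skill level -> percent members performing}.
    Returns task -> list of skill levels at which the cutoff is met.
    """
    selected = {}
    for task, by_level in percent_performing.items():
        levels = [level for level, pct in by_level.items() if pct >= cutoff]
        if levels:
            selected[task] = levels
    return selected

# Hypothetical survey results for three tasks.
survey = {
    "Inspect hydraulic lines": {"apprentice": 42.0, "journeyman": 55.0, "technician": 8.0},
    "Prepare shift schedules": {"apprentice": 1.0, "journeyman": 6.0, "technician": 31.0},
    "Calibrate test bench": {"apprentice": 2.0, "journeyman": 9.5, "technician": 4.0},
}
training_standard = select_tasks(survey)
```

A task that misses the cutoff at every skill level, such as the third one here, is left off the standard entirely.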


Assignment of Task Skill-Knowledge Codes


Once tasks are selected for training, the establishment of
skill-knowledge or proficiency codes is the next logical step. This step
is not presently being recommended for implementation due to conceptual
and research questions which arise in developing guidelines. The problems
encountered in attempting to establish skill-knowledge codes are described
to emphasize the need for further consideration of this topic by policy
makers and researchers alike. The third subsystem of this model is not
dependent on this step; therefore, the actually applicable model may be
viewed as a two-step model for the present.

It was at this juncture that difficulty in applying ISD guidance was
encountered. Six criteria were selected based on ISD literature. They were:
percent members performing, task difficulty, task delay tolerance, probable
consequences of inadequate performance, frequency of performance, and time
to initial performance. Measures of the first four criteria were derived
from exact measurement (percent members performing was based on a count;
the other criteria were based on ratings). Global estimates of frequency of
performance were made based on cumulative time spent over all tasks.
Finally, time to initial performance was considered to be "low" if 30
percent or more of the airmen within the first year following training
performed the task; otherwise it was considered to be "high."
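
The classification rule for time to initial performance is simple enough to state in a few lines. The function name is hypothetical; the 30 percent threshold is the one given above.

```python
FIRST_YEAR_CUTOFF = 30.0  # percent of first-year airmen performing the task

def time_to_initial_performance(pct_first_year_performing):
    """Classify as "low" when 30 percent or more of first-year airmen perform the task."""
    return "low" if pct_first_year_performing >= FIRST_YEAR_CUTOFF else "high"
```

A "low" value therefore marks a task that many airmen meet soon after leaving training.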


After developing complex decision rules based on the ISD literature,
two problems arose. First, it became quite clear that decision rules
involving all factors were not necessary for establishing proficiency
codes. Percent members performing affected inclusion into the training
standard, as did time to initial performance. Also, frequency of
performance did not seem to affect the proficiency codes in any meaningful
way. The only factors found to actually affect proficiency code levels
were criticality (as measured by probable consequences of inadequate
performance and task delay tolerance) and learning difficulty. More
disturbing, however, was the conceptual problem of whether proficiency
level is invariant for a task or whether, in fact, proficiency changes
over time as individuals learn more about the task and gain additional
experience. ISD literature is not clear on this question. The assumption
that adding elements (skills and knowledges) to an occupational survey
task changes the task such that it is a new task further confuses the
issue. It was felt that until research findings and policy decisions shed
light on these problems, the proposed model could not realistically
address the issue.


Derivation  of  Formal  Basic  School  Training 


Once  the  tasks  are  listed  on  the  training  standard,  percent  members 
performing  for  first  enlistment  personnel  is  the  primary  factor  for 
inclusion  in  the  resident  course.  In  addition,  extremely  difficult  or 
extremely  critical  (as  measured  by  probable  consequences  of  inadequate 
performance  and  task  delay  tolerance)  tasks  are  included  in  the  course. 
Provision  for  manual  override  is  made  in  the  model,  with  the  understanding 
that  overrides  will  be  justified.  The  algorithms  used  in  the  model  can  be 
programmed  so  that  occupational  data  can  be  displayed  and  tasks  flagged  to 
facilitate  the  construction  of  the  specialty  training  standard. 
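
One way such an algorithm might be programmed is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the paper gives no numeric cutoffs for this subsystem, so all three thresholds, the rating scales they imply, and the field names are hypothetical placeholders.

```python
from dataclasses import dataclass

# All three thresholds are hypothetical placeholders; the paper gives no values.
PERCENT_CUTOFF = 30.0       # percent of first-enlistment members performing
EXTREME_DIFFICULTY = 8.0    # on an assumed difficulty rating scale
EXTREME_CRITICALITY = 8.0   # on an assumed criticality rating scale

@dataclass
class Task:
    name: str
    pct_first_enlistment: float
    difficulty: float
    criticality: float

def include_in_course(task, overrides=None):
    """Return (decision, reason) for placing a task in the resident course.

    overrides maps task name -> (decision, justification); an override
    without a justification is rejected.
    """
    overrides = overrides or {}
    if task.name in overrides:
        decision, justification = overrides[task.name]
        if not justification.strip():
            raise ValueError("a manual override must be justified")
        return decision, "override: " + justification
    if task.pct_first_enlistment >= PERCENT_CUTOFF:
        return True, "percent members performing"
    if task.difficulty >= EXTREME_DIFFICULTY:
        return True, "extremely difficult"
    if task.criticality >= EXTREME_CRITICALITY:
        return True, "extremely critical"
    return False, "below all criteria"

# Hypothetical task: rarely performed by first-term airmen, but highly critical.
rare_but_critical = Task("Arm ejection seat", 4.0, 5.0, 9.0)
decision = include_in_course(rare_but_critical)
```

Returning the reason alongside the decision makes it easy to flag tasks on a display and to see which ones entered the course on difficulty or criticality rather than on percent members performing.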


SUMMARY 


This paper presents the results of a concerted effort to interface
occupational survey data with instructional systems development. Unlike
other ISD literature, this effort was data based, using data from three
different Air Force specialties. The use of real world data in applying
ISD criteria allowed for a reassessment of the criteria. Although the
model has been applied to one Air Force specialty by training specialists,
further testing of the model is required to provide a decision data base
which will allow for adjustment of cutoffs and criteria.

It is important to note the difficulty encountered in developing
proficiency codes. Certainly, further research is required on this point.

Finally, this paper presents the beginnings of a model which was
derived from existing models and actual data. The model offers a data-based
development of courses as well as the capability to expand and contract
courses objectively. The model is not finalized, and it is not being
seriously considered for adoption. However, with the appropriate testing
and development, it offers the hope of objective proceduralized training
development for all training systems which have access to occupational
survey data.


Implementing Instructional Technology in Army Training:
Some Obstacles and Solutions


Wesley  K.  Roberts 
Courseware,  Inc. 
San  Diego,  California 


This paper presents a summary of contractually supported
instructional technology assistance provided to the various Army
service schools through the U.S. Army Combat Arms Training Board
during the period 1971-1975. Assuming the role of a change agent
is challenging even without the presence of obstacles; implementing
modern instructional methods into Army training proved to be full
of obstacles as well as a challenge. Only through a tenacious
effort on the part of a few Army officers and noncommissioned
officers to meet the change agent challenge have many programs
prominent in Army training today survived.

The content of this paper is candid and is not available in
any other single source. It provides a historical perspective
on many Army training programs today and the effort required to
prepare the system for their implementation.


On 17 December 1971, the Army's Chief of Staff authorized
the formation of the U.S. Army Combat Arms Training Board at Fort
Benning, Georgia. This action was the result of a recommendation
by the Board for Dynamic Training, which had been functioning since
August 1971 as an investigative agency to consider ways of
supporting unit commanders in conducting meaningful and exciting
training. Formation of the U.S. Army Combat Arms Training Board
allowed for implementation of recommendations made by the Board
for Dynamic Training.1


Identifying a Requirement for
Instructional Technology Training Support


A major recommendation by the Board for Dynamic Training was
that a program be developed that would rapidly restore deficit
Noncommissioned Officer and Specialist confidence by upgrading
training to develop their professional competence. To meet the
challenge of correcting the gaps (cf. Kaufman, 1973)2 identified
in the training system, it would be necessary to systematically
improve training provided to combat arms soldiers and ultimately
to all soldiers. Some requirements were isolated for immediate
implementation from which the generic effects of a systematic
approach could be further implemented.

A  priority  requirement  was  for  the  combat  arms  schools  to 
provide  Military  Occupational  Specialty  (MOS)  related  training 
extension  courses  directly  to  soldiers  in  small  units.  These 
courses  were  to  be  prepared  using  a  multimedia  format,  directed 
at  both  individual  and  small  group  training.  The  concept  of  the 
training  extension  courses  was  to  take  subject  matter  expertise 
found in the Army's service schools and export it to the soldiers
"in-the-field" in the form of up-to-date training materials.
Through  this  method,  individual  training  in  units  would  be  kept 
current  with  service  school  doctrine. 

In  order  to  begin  this  extensive  program  of  extension  course 
development,  it  was  necessary  to  begin  to  identify  and  make 
provision  for  the  support  required.  A  cursory  survey  of  the 
combat  arms  schools  (Air  Defense,  Armor,  Field  Artillery,  and 
Infantry), and feedback from the implementation of similar
programs,  indicated  a  need  to  provide  training  for  this 
requirement  not  only  at  the  technical  personnel  level,  but  for 
middle  and  senior  managers  as  well.  Crucial  to  the  mission  of 
preparing  these  extension  courses  was  support  at  the  command 
level  within  each  combat  arms  school. 


To  facilitate  the  senior  level  management  support  for  the 
extension  course  program,  a  conference  for  Assistant  Commandants 
of  the  combat  arms  schools  and  representatives  from  the 
Department  of  the  Army  was  held  at  the  United  States  Military 
Academy  at  West  Point,  New  York.  During  this  conference, 
attendees  were  briefed  on  current  techniques  in  instructional 
technology,  as  well  as  the  major  obstacles  to  be  removed  to 
partially  insure  success  of  the  program. 

As the plan for developing the training packages in extension
course form (hence the name Training Extension Course, or TEC)
was taking shape, considerable effort was devoted to identifying
additional training support needs. A review of the existing
training regulations and guidance (CONARC Regulation 350-100-1,
later revised and adopted as TRADOC Regulation 350-100-1, and the
then-existing FM 21-6) provided the following insight into what
training might be required. Deficiencies identified in CONARC
Regulation 350-100-1 included:

(1)  little  "how  to"  guidance  for  training  developers, 

(2)  no  overview  of  the  total  system, 

(3)  criterion  tests  were  developed  after  training  materials, 

(4) knowledges and skills were not related to methods,

(5)  no  developmental  model, 

(6)  nothing  addressed  the  actual  conduct  of  training,  and 

(7) it contained only a cursory section on quality control.

Another source of input for the training support requirement was
FM  21-6,  Conduct  of  Military  Training.  At  that  time,  FM  21-6 
addressed  resident  instruction  and  did  not  provide  guidance 
concerning  other  training  methods. 

The  training  support  requirement  expanded  when  job  task  data 
information  at  the  service  schools,  as  specified  in  350-100-1, 
was  found  to  be  insufficiently  developed.  Further,  the  Military 
Occupational  Data  Bank  (MODB)  did  not  have  the  required  job 
analysis  information  for  designing  required  instruction.  At  that 
time,  the  information  available  was  aggregated  for  personnel 
purposes.  Due  to  the  deficiencies  identified,  it  became 
necessary  to  initiate  several  research  and  development  programs. 

A contract with the Human Resources Research Organization
(HumRRO) was established for the purpose of analyzing job
requirements for the eight Military Occupational Specialties
selected for initial TEC development (11B, 11C, 11D, 11E, 13A/B,
13E, 16P, 16R). This action was taken to resolve the immediate
job  analysis  data  needs  for  systematically  developing  training  in 
the  TEC  program  and  to  serve  as  an  empirical  basis  from  which  new 
directions  in  occupational  research,  in  support  of  training 
requirements,  could  evolve. 

A systems engineering workshop was held to instruct service
school personnel in how to generate front-end analysis data: how to
write task statements, prepare task lists and surveys, etc. This
workshop  was  held  at  several  sites,  conducted  jointly  by  HumRRO 
and  Combat  Arms  Training  Board  personnel,  and  attended  by  service 
school  personnel  identified  to  work  in  the  TEC  program. 

Further,  professional  support  was  provided  to  the  emerging 
TEC  program  by  contractual  assistance  provided  through  the  Army 
Research  Office  and  Battelle  Laboratory's  Durham,  North  Carolina 
Office.  This  allowed  the  Combat  Arms  Training  Board  to  acquire 
the  expertise  of  several  professional  analysts  who  were 
knowledgeable  in  the  field  of  Instructional  Technology.  These 
analysts contributed technical guidance to managers in the TEC
program  and  other  emerging  programs.  This  service  became  a 
valuable  asset  in  the  TEC  program  and  tremendous  benefit  resulted 
from  its  use. 


More Evidence for Training Support


Concurrent  with  the  conceptualization  of  the  design  to  be 
used in developing TEC lessons, plans were made to test this
approach  by  providing  training  to  soldiers  preparing  for  the  1972 
11B40  (light  weapons  infantryman)  MOS  Test.  This  phase,  known  as 
the  Unit  Training  Extension  Course  (UTEC)  Program  or  TEC  I,  was 
used  to  test  the  effectiveness  of  the  TEC  concept.  Scores  on  the 
November  1972  11B40  MOS  Test  were  used  as  the  criterion  of 
effectiveness.

A  total  of  56  TEC  lessons  were  developed  by  a  committee 
within  the  U.S.  Army  Infantry  School,  addressing  four  major  test 
domains.  These  lessons  were  largely  prepared  by  Infantry  Officer 
Advanced Course students and were in a 35mm slide/synchronized
sound cassette tape format. HumRRO Division No. 4, Fort Benning,
Georgia, was contracted to design a study to evaluate this
project.3

Results of this investigation reported positive gains from
the  use  of  these  instructional  materials  by  soldiers  in 
preparation  for  the  11B40  MOS  Test  when  verbal  ability,  command 
emphasis,  and  study  factors  were  weighted  with  test  scores. 
Otherwise,  no  significant  differences  were  found  between  soldiers 
using  and  those  not  using  the  instruction. 

A deficiency found in the conduct of this early TEC effort
was the materials themselves. The lessons were not specifically
designed  to  teach  those  skills  being  evaluated  on  the  11B40  MOS 
Test.  The  relationship  between  the  TEC  I  lessons  and  the  test 
items  evaluated  by  the  November  1972  version  of  the  11B40  MOS 
test  was  ambiguous  with  the  exception  of  one  major  test  domain. 
These  findings  are  not  surprising  in  retrospect:  the  developers 
of  these  lessons  had  no  training  in  instructional  technology 
prior  to  their  involvement  in  this  project;  the  lessons 
themselves  were  simply  "illustrated  lectures",  even  though  they 
represented  a  noteworthy  effort;  objectives  had  not  been 
systematically  prepared  from  use  of  a  job  analysis;  and  the  11B40 
MOS  test  items  did  not  measure  mastery  of  TEC  I  lesson 
objectives.  Further,  systematic  revision  of  lessons  that  did  not 
adequately  instruct  their  objectives  was  not  provided  for  during 
the developmental process. A recommendation made by HumRRO, as a
result  of  this  study,  was  to  "systems-engineer"  all  future  TEC 
lessons,  allowing  for  the  lessons  to  concentrate  on  job-relevant 
skills  and  to  be  evaluated  by  job-relevant  performance  tests.* 


The  Requirement  for  Instructional 
Technology  Training  Support 


Abundant  empirical  evidence  clearly  indicated  that  the 
Army's  combat  arms  schools  lacked  personnel  trained  to  prepare 
the  systematic  instruction  demanded  if  the  TEC  program  was  to 
achieve  the  goal  of  rapidly  restoring  Noncommissioned  Officer  and 
Specialist  confidence  and  competence  (in  part)  via  extension 
courses  as  recommended  by  the  Board  for  Dynamic  Training. 

*During this time frame the movement for incorporating
performance  tests  into  the  MOS  testing  program  was  gaining 
considerable  momentum  but  was  not  conceived  in  the  present  Skill 
Qualification  Test  form.  Performance  testing  was  largely 
restricted  in  use  to  Basic  Combat  Training.  The  results  of  this 
and  similar  tests  aided  in  the  move  to  a  more  logical  method  of 
job  performance  evaluation  being  implemented  under  the  current 
Enlisted  Personnel  Management  System. 


Existing  instructor  development  programs  were  evaluated  as 
not  adequate  to  train  personnel  in  instructional  technology.  The 
Army's systems-engineering model (then CONARC, now TRADOC
Regulation 350-100-1) had been reviewed4 and was found to be an
inadequate  source  of  instructional  development  guidance. 

A  search  for  an  existing  training  course  in  Instructional 
Technology  outside  the  Army  was  then  conducted.  During  this 
search  a  lesson  development  workshop  that  could  immediately 
facilitate  training  of  personnel  for  the  TEC  program  was 
identified.  The  model  taught  in  this  workshop  is  known  as 
CISTRAIN,  the  acronym  for  Coordinated  Instructional  Systems.5 

Two Instructional Technology Workshops (CISTRAIN) were
contracted by the Combat Arms Training Board to Deterline
Associates, Inc. The first of these workshops was held in San
Francisco, California. Attending this 28 November - 8 December
1972 workshop were 35 personnel from the Army's Combat Arms
Schools, the Combat Arms Training Board, the United States
Military Academy, and other agencies. The goal of this workshop
was  "to  enable  workshop  attendees  to  further  develop  and 
strengthen  their  knowledge  concerning  instructional  technology 
for  the  purpose  of  conducting  similar  training  for  staff  and 
faculty  members  of  their  respective  schools  that  would  be 
directly  involved  in  the  development  of  TEC  II  instructional 
materials  ...  and  to  further  develop  staff  and  faculty  not  only 
for  designing  and  producing  TEC  II  instructional  materials,  but 
also  to  expand  instructional  technology  expertise  throughout  the 
Army school systems."6

Concurrent  with  this  training,  specifically  in  April  of 
1973,  Robert  K.  Branson  and  Robert  Morgan  of  Florida  State 
University conducted a two-day seminar for assistant commandants
of  the  Army's  Combat  Arms  Schools  and  other  high  level  officers. 
The  purpose  was  to  inform  high-ranking  officers  of  the 
implications  of  instructional  technology  for  large  scale 
planning,  critical  to  the  success  of  the  long-range  effort. 

The second Instructional Technology Workshop (CISTRAIN) held
in  support  of  the  TEC  program  was  conducted  during  the  period  9  - 
20  July  1973  in  Washington,  D.C.  At  this  workshop,  37  attendees 
from  Army  Service  Schools,  the  U.S.  Army  Combat  Arms  Training 
Board,  the  Air  Force,  the  Navy,  and  the  Marine  Corps  received 
instruction similar to that given to those attending the earlier San Francisco
Workshop.  The  goals  for  the  second  workshop  were  somewhat 
modified  from  those  in  the  earlier  one.  They  were:  "(1)  to 
broaden  the  U.S.  Army’s  community  of  education  technologists  by 
offering  training  by  some  of  the  nation's  top  educators,  (2)  to 
further  develop  and  strengthen  the  knowledge  and  skill  of 
workshop  participants  so  they  can  conduct  a  similar  course  for 
staff  and  faculty  members  of  their  respective  schools  and,  (3)  to 
provide  the  participants  with  an  instructional  development  model 
and  teaching  vehicle  that  can  be  used  to  train  instructional 
developers  in  U.S.  Army  Service  Schools  who  develop  training 
materials  for  resident  courses  or  for  export  to  soldiers  in 
units."7


The  Washington,  D.C.,  workshop  represented  the  training 
identified  as  a  necessary  catalyst  for  follow-up  instruction  at 
five service schools that were to become the participants in the
TEC program now known as TEC III. These were the U.S. Army
Engineer School, the U.S. Army Southeastern Signal School (now
the U.S. Army Signal School), the U.S. Army Quartermaster School,
the U.S. Army Ordnance Center School and the U.S. Army Adjutant
General School. Each service school in attendance was provided
with a complete set of workshop materials to meet the requirement
of  further  training  for  the  TEC  program  personnel. 

Although considerable technical support and guidance were
being  provided  to  Army  service  school  personnel  for  the  TEC 
program,  it  was  necessary  to  expand  training  support  for  the 
long-range  instructional  system  development  that  was  to  be 
ultimately implemented.

The  United  States  Army  Infantry  School,  supported  by  Combat 
Arms  Training  Board  funding  and  contractually  assisted  by 
Insgroup,  Inc.,  began  an  extensive  evaluation  of  its  instructor 
training  course.  As  a  result,  it  was  determined  that  additional 
instruction was required on the theory of learning and lesson
development.  All  instruction  was  converted  to  self-paced, 
mediated  materials,  i.e.,  TV,  tape-slide  programs  and  programmed 
texts.  The  number  of  practical  exercises  was  reduced  from  nine 
to  seven  by  eliminating  one  twenty-minute  exercise  and  a  briefing 
requirement.  The  revised  self-paced  course  was  an  improvement, 
but  still  did  not  account  for  diversified  and  individual  training 
requirements.  * 


At the conclusion of both the San Francisco and Washington,
D.C. workshops, it was intended that the participating service
schools would use the workshop materials provided to conduct
subsequent workshops to train a substantial number of people in
the  basics  of  instructional  technology.  Nevertheless,  this 
program  was  never  fully  realized.  Although  several  schools  made 
noteworthy attempts, only the U.S. Army Armor School (which sent
its Education Advisor to the course) ever really implemented
follow-on training using these materials. In retrospect, some
reasons for this failure may be intuitively identified. CISTRAIN
was not Army doctrine, nor was it ever officially sanctioned as a
faculty development program to be institutionalized in the Army.
Service schools did not (in many cases) send a faculty developer,
i.e.,  an  instructor  for  instructors,  to  the  workshop  -  hence  who 
was  going  to  subsequently  follow  through  with  the  training 
program?  Many  high  level  managers  did  not  perceive  this  to  be  a 
viable solution to their training problem (instructional
technology  was  new  to  them)  and  did  not  choose  to  support  the 
training  of  additional  personnel.  There  was  no  follow-on 
training  provided  to  potential  course  managers  to  insure  they 
could teach the workshop. And the schools perceived (in many
cases) that a contractually supported effort for the development
of TEC lessons relieved the necessity for such training.


The  Broadening  of  the  Training  Requirement 


During  the  period  when  the  TEC  program  was  contractually 
adding  "trained"  Instructional  Technologists  to  the  organizations 
participating  in  this  endeavor,  the  U.S.  Army  Combat  Arms 
Training  Board  was  busy  supplementing  this  training  with 
internally  developed  seminars,  conferences,  and  workshops 
designed  for  various  tasks  necessary  in  developing  systematic 
instruction. The zeitgeist prevailing in the Army training
community  was  emerging  as  that  of  meeting  the  training 
requirements  head-on  with  the  latest  developments  in 
instructional technology. Said another way, the "tip of the
iceberg"  which  would  lead  to  the  total  commitment  in  the  Army 
toward  the  full  exploitation  of  training  technology  during  the 
middle and late 1970's was beginning to emerge.

Contemporary issues at the national level were often leveled
at  the  post-Vietnam  military  establishment  in  terms  of 
substantial  budget  cuts.  Congressional  leaders  were  being  forced 
to  call  for  "more  bang  for  the  buck"  and  "more  teeth  than  tail." 
Consciousness  was  being  raised  at  the  highest  levels  across  the 
armed  services  of  the  desirability  of  less  time  spent  in  training 
and more time in deployable combat and combat support units;
toward  higher  levels  of  professionalism  on  the  part  of  armed 
service  personnel;  toward  increased  efficiency  and  proficiency  in 
job  performance;  and  all  of  this,  for  less  training  cost. 

One  alternative  to  the  tremendous  pressures  of  the  day  was 
to  conduct  interservice  training  when  the  tasks  to  be  trained 
were common across more than one service. While there has always
been  an  exchange  of  training  ideas  and  some  programs  throughout 
the  history  of  our  armed  forces,  it  has  not  been  without 
trepidation  of  the  "Purple  Suit,"  wherein  the  various  military 
services  lose  their  distinct  identity,  a  concept  insensitive  to 
the  ever  prevailing  philosophy  that  our  specialized  armed 
services  (e.g.,  land,  sea,  air)  require  specialized  training  to 
perform  their  distinctly  assigned  missions.  Additionally,  there 
are  the  manpower  and  budget  consequences  of  turning  over  a 
portion  of  training  to  another  service.  These  considerations 
provide  understandable  reasons  for  the  military  establishment's 
reluctance  to  conduct  large-scale  interservice  training. 

In  an  effort  to  ferret  out  solutions  to  the  dilemma  posed  by 
the training resource and manpower constraints being placed on
the  post-Vietnam  armed  services,  the  commanders  of  the  four 
Military  commands  met  in  Washington,  D.C.  during  the  September  of 

to  establish  an  Interservice  Training  Review  Board.  The 
purpose  of  the  Board  was  to  promote  economy  in  training  through 
the use of interservice training. Subordinate to the Board were
several committees constituting the Interservice Training Review
Organization. Each of these committees was further broken down
for more specific functions for which subcommittees were formed.
One of these subcommittees, the Interservice Subcommittee for
Instructional Systems Design, was formed when its predecessor's
function (standardization of a training glossary for the armed
services) was redirected. Concomitant with this redirection was
the  broadening  of  the  new  subcommittee's  function.  It  was 
charged  to  develop  a  model  and  set  of  procedures  for  the 
development  of  curriculum  for  interservice  training  programs. 9 

Paralleling these activities were several on-going projects
within  the  Army  independently  working  toward  partial  solutions  to 
the  larger  issue  of  systematic  training  development  and 
management  for  the  total  training  system.  There  was  the  HumRRO 
8-MOS  study,  the  revision  of  TRADOC  Regulation  350-100-1, 
research funded by the Army Research Institute on Criterion-
Referenced Tests, the TEC program, a Baseline study designed
to  identify  common,  semi-common,  and  unique  soldiering  tasks,  the 
Experimental Volunteer Army Training Program (EVATP) at Fort Ord,
California,  the  need  to  assist  service  schools  in  expanding 
instructional  technology  skills  to  more  personnel,  and  the 
potential  adaptation  by  the  Army  of  Air  Force  Pamphlets  50-2  and 
50-58  on  Instructional  Systems  Design.  In  order  to  capitalize  on 
this  and  other  research  and  development  efforts,  the  Combat  Arms 
Training  Board  formalized  a  contract  with  Florida  State 
University for the purpose of assisting the Army in bringing its
spectrum  of  training  research  and  development  into  a  total 
working  system. 


The  Combat  Arms  Training  Board 
and  the  Interservice  Project 


Not oblivious to, yet separate from, the Interservice action
mounting in training circles in the early 1970's, the Combat Arms
Training Board, in fulfilling its charge stemming from the
recommendations of the Board for Dynamic Training, and in an
effort to investigate the actual status of instructional
capabilities within the Army, had on 29 May 1973 entered
into a contractual arrangement with the Center for Educational
Technology of the Florida State University. The original purpose
of  this  contract  was  to  survey  the  state  of  the  art  in 
Instructional  Technology  in  the  Army. 

Preliminary  reports  from  the  Florida  State  Study  cogently 
indicated  a  requirement  for  the  Army  to  develop  an  instructional 
technology  manual  which  would  lend  harmony  to  the  guidance  and 
initiatives  in  training  research  and  development.  Based  on  these 
early  findings  and  recommendations,  the  Combat  Arms  Training 
Board  expanded  the  scope  of  work  under  contract  with  the  Florida 
State  University  to  allow  for  the  design  of  an  instructional 
technology  manual  and  supporting  workshop  materials. 


On  6  September  1973,  members  of  the  U.S.  Army  Combat  Arms 
Training  Board  met  in  Atlanta,  Georgia  with  Major  General  Ira 
Hunt, the Deputy Chief of Staff for Individual Training, U.S.
Army Continental Army Command (since reorganized, with the Service
School and Training Center mission assigned to the new U.S.
Army Training and Doctrine Command, TRADOC). General Hunt was
briefed  on  the  findings  and  recommendations  of  the  study 
performed  by  the  Florida  State  University.  Based  upon  this 
report, General Hunt authorized the Combat Arms Training Board to
continue  its  contractual  development  of  an  instructional 
technology  manual  which  would  be  designed  to  replace  CONARC 
Regulation  350-100-1,  and  the  necessary  workshops  designed  to 
train  and  implement  the  new  instructional  technology  manual  at 
the  service  school  top  level  management,  middle  management,  and 
instructor  level.10 

Earlier,  on  26  July  1973,  the  Interservice  Subcommittee  on 
Instructional  Systems  Design  met  at  Fort  Benning,  Georgia,  to 
resolve  procedural  questions  in  connection  with  the  design  and 
development  of  a  model  and  set  of  procedures  for  Interservice 
curriculum development.11 This meeting was called in light of the
committee's knowledge of the contract between the U.S. Army
Combat  Arms  Training  Board  and  the  Florida  State  University. 

This  meeting  led  to  an  agreement  between  the  U.S.  Army  Combat 
Arms Training Board, the Florida State University, and the
Interservice Subcommittee on Instructional Systems Design, that the
contracted  work  could  possibly  be  redirected  without  serious 
imposition  to  serve  the  needs  of  not  only  the  Army,  but  the 
interservice  community  as  well.  This  meeting  led  to  the  eventual 
inclusion  of  Interservice  participation  in  the  Combat  Arms 
Training  Board's  contract  with  Florida  State  University. 

While the Interservice Committee had the charge of developing
a model and set of procedures for interservice curriculum
development,  the  U.S.  Army  Combat  Arms  Training  Board  required 
broader  research  and  development  investigation  for  the 
improvement  of  Army  training.  Therefore,  from  time  to  time, 
modifications  were  made  and  tasks  added  to  the  contract  with  the 
Florida  State  University  that  addressed  the  specific  requirements 
of both the U.S. Army Combat Arms Training Board and the
Interservice  Committee.  The  research  and  development  products 
delivered by the Florida State University ranged from a report
summarizing the state of the art in instructional technology in
the Army, a prototypic TEC audiovisual kit, a manual for
preparing extension courses, and a model for service school
staffing to a model and set of procedures for interservice
curriculum development.


The Interservice Procedures for Instructional
Systems Development


The single most significant product developed by the
Florida  State  University,  the  one  most  often  referred  to,  was  the 
five-volume  set  of  manuals  titled  Interservice  Procedures 
for Instructional Systems Development.12 These procedures are
often  thought  of  as  the  only  product  resulting  from  the  Combat 
Arms  Training  Board  -  Florida  State  University  contract.  This  is 
not  true.  Although  their  preparation  could  not  have  been 
accomplished  without  some  of  the  products  from  other  contractual 
tasks,  these  manuals  represent  part  of  the  deliverables  from  Task 
5 only (there were a total of eight contractual tasks). It is
important  to  note  here  that  the  major  intent  of  Florida  State 
University's  research  became  that  of  preparing  a  manual  for 
TRADOC  schools.  It  was  subsequently  modified  to  the  preparation 
of  a  set  of  manuals  on  Instructional  Systems  Development  for 
Interservice  Training. 

Input  to  and  the  development  of  these  procedures  was  a 
massive  process.  Possibly  the  most  difficult  task  performed  by 
the authors of the procedures was to restrict the narrative to
content  pertinent  to  interservice  curriculum  developers.  One  of 
the  methods  used  to  insure  the  content  of  the  procedures  was 
relevant  to  their  intended  target  population  was  to  conduct 
formative  evaluation  trials  on  both  the  procedures  themselves, 
and  the  three  levels  of  training  workshops  designed  to  support 
their  implementation  and  use. 

Formal  review  of  the  procedures  began  during  November  1974. 
Phases I and II (Analyze and Design) of the procedures were
evaluated in an interservice workshop held at Fort Benning,
Georgia. Phases III, IV, and V were evaluated in an interservice
workshop  held  at  the  Naval  Training  Center,  San  Diego,  California 
during  February  1975.  Further,  the  manuals  were  reviewed  by 
experts  in  all  Army  Service  Schools  during  the  spring  of  1975. 

The  aggregated  data  and  comments  from  attendees  were  used  for 
revision  of  the  procedures,  and  a  revised  version  of  the 
procedures was published in July 1975. Critical to the
understanding of this program is its separation from any particular
program  within  any  one  service.  The  formative  evaluation 
workshops  were  never  intended  to  serve  as  catalysts  for  the 
promulgation  of  these  procedures  within  the  services.  They  were, 
simply, "validation trials." The workshops for teaching the
procedures at the technical, middle manager, and senior manager
levels were validated at Fort Benning, Georgia;
Fort Eustis, Virginia; Naval Training Center, San Diego,
California; Fort Gordon, Georgia; Tallahassee, Florida; and
Pensacola, Florida.


Status of the Interservice Procedures for
Instructional Systems Development


Although  the  model  had  been  approved  at  the  close  of  1975, 
the  revised  version  (July  1975)  of  the  Interservice  Procedures 
Manual was not adopted by the Interservice Training Review Board.
The  Interservice  Committee  for  Instructional  Systems  Development 
prepared  a  report  for  the  Interservice  Board  on  pilot  projects, 
implementing the procedures in actual training settings (cf.
Scanland, 1977). The philosophy of the Board being one of
insuring the procedures were reliable and usable, several pilot
projects representing different research methods were planned by
the services to evaluate the procedures. The largest effort was
a  contractually  supported  project  at  the  U.S.  Army  Signal  School, 
Fort Gordon, Georgia. This pilot implementation of the
procedures  did  not  serve  the  exact  purpose  of  the  Interservice 
Community.  It  was  designed  with  the  Army  in  mind  under  the 
philosophy  that  the  interservice  procedures  were  generic  to  all 
services,  and  specific  enough  to  test  within  the  Army.  This 
project  was  never  fully  completed  and  was  redirected  to  address 
job  performance  aids  and  other  training  methods.  The  Navy  did 
not  implement  a  pilot  program;  nevertheless,  the  procedures  were 
published by the Navy as directive NAVEDTRA 106A. The Army
published  the  manuals  as  TRADOC  Pamphlet  350-30.  The  Air  Force 
and  Marine  Corps  did  not  assign  an  identifying  number  within 
their service to the interservice procedures. The general
view throughout the military training establishment became
that, while the procedures were developed for an interservice
purpose, they provided considerable "how to" guidance and
reference for use as a resource document. Even though there was a
continuing request for instructional technology workshops, these
particular materials remained relatively untested at the close
of 1975. The workshop materials and the procedural manual are
currently being made available to the trainers within the various
services,  and  it  is  anticipated  that  there  will  be  several 
alternative  methods  employed  to  use  these  materials  as  training 
documents.


Future Requirements for Contractually Supported
Generic  Instructional  Technology  Workshops 


Over  the  past  several  years  (post-Vietnam  era)  the  Army  has 
been  streamlining  training  and  personnel  management  systems  for  a 
myriad  of  reasons.  During  1974-75,  two  significant  programs  were 
developing  which  will  have  considerable  and  widespread  impact  in 
the  Army  over  the  next  several  years.  These  are  the  Enlisted 
Personnel  Management  System  and  the  Officer  Personnel  Management 
System.  The  U.S.  Army  Training  and  Doctrine  Command  has 
participated  in  their  development  since  the  beginning  of  these 
systems.  Calling  for  a  significant  restructuring  of  the  training 
support  for  career  development  in  both  the  enlisted  and  officer 
ranks,  these  systems  provide  for  the  multilevel  structuring  of 
training  to  support  actual  and  potential  personnel  assignments. 
Concurrent  with  the  development  of  these  systems,  considerable 
training  resource  and  manpower  constraints,  the  creation  of  three 
new  combat  divisions  (for  a  total  of  16  in  the  Active  Army) 
within  existing  personnel  limits  placed  on  the  Army,  and  other 
major  considerations,  have  caused  the  U.S.  Army  Training  and 
Doctrine  Command  to  investigate  other  advanced  technologies  to 
achieve  mission  success  within  the  tumultuous  environment  it 
operates  in. 

A  very  definite  direction  toward  the  self-pacing  of 
instruction  resulted  as  an  outcome  of  an  Instructional  Technology 
Symposium  held  at  Fort  Eustis,  Virginia  in  1975. 14  The  symposium 
focused  on  problems  confronting  resident  Army  service  school 
training  and  the  requirement  to  provide  training  to  soldiers  in 
units.  The  symposium  resulted  in  two  major  directions  to  Army 
service  school  commandants.  They  were:  (1)  a  charge  from  General 
William  E.  DePuy,  the  Commanding  General  of  the  U.S.  Army 
Training  and  Doctrine  Command  at  that  time,  to  provide  command 
level  support  to  self-pacing  initiatives,  and  (2)  to  use  the 
self-pacing method whenever possible. This direction was
aimed at developing revised and new training programs in a
more cost-effective manner and at enabling soldiers to master
resident school training based on their individual capabilities
as measured against established criteria. The latter effect was
specifically  an  attempt  to  place  soldiers  into  units  as  soon  as 
possible  by  reducing  their  time  in  the  training  base. 

Empirically, self-paced programs tend to reduce training
time by upwards of 25 percent and cut training costs over time
without loss in student performance. In many instances,
actual student performance has shown significant
improvement over scores recorded in more traditionally conducted
courses.

With the advent of the Enlisted Personnel Management System
came  the  use  of  constrained  task  lists.  This  process  was  one  of 
eliminating  tasks  from  training  programs  by  analyzing  their 
relative  importance  to  the  overall  goal  of  the  course  in  which 
they  were  contained.  The  constraining  of  tasks  to  be  taught 
resulted  in  reduction  of  some  training  time  and  consequently 
training  costs.  This  procedure  was  by  itself  a  major  change  in 
the  direction  from  traditional  training  methods. 

The development of Soldier's Manuals and their corresponding
Skill Qualification Tests established not only defined tasks for
job performance within a Military Occupational Specialty, but
also the precise criteria for measuring task mastery. These two
components of the Enlisted
Personnel  Management  System  are  now  the  focal  point  for 
establishing  remedial  training  for  soldiers  not  performing  at  the 
required  criterion  level  within  their  job  on  a  task  basis. 


As  a  result  of  many  events  culminating  in  the  10-11  December 
1975  TRADOC  Commanders  Conference,  a  new  requirement  was 
identified  to  provide  self-pacing  workshops  for  Army  training 
personnel.  This  requirement  came  about  due  to  the  accelerated 
nature of training technology developments within the Army to
achieve  its  training  mission  vis-a-vis  the  economic  pressures  and 
manpower  restrictions  of  the  day. 

In  order  to  provide  adequate  training  in  the  technology  of 
self-paced  instruction,  the  Training  Management  Institute  (now 
the  Training  Development  Institute)  integrated  relevant 
instructional  materials  that  exist  within  the  Army  with 
pertinent and essential validated self-paced training materials
and expertise available in the academic and industrial
communities.


Contractual Assistance Required Beyond the Immediate Future


It  would  be  a  sophomoric  assumption  to  state  that  the 
contractual  support  in  instructional  technology  workshops 
completed  in  the  recent  years  by  funding  support  from  the  U.S. 
Army  Combat  Arms  Training  Board  represents  more  than  a  beginning 
to  what  may  be  demanded  in  the  long  range  future  of  training 
developments  within  the  Army.  Striving  for  excellence  in 
training  programs  at  such  a  rapid  pace  and  with  such  tremendous 
in-house  personnel  and  resource  restrictions  makes  it 
inconceivable  to  identify,  at  this  juncture,  a  training  workshop 
in  instructional  technology  either  within  or  outside  of  the 
military which would be generic, yet specific enough to address
the  training  needs  of  the  instructional  technology  personnel 
involved  in  improving  Army  training  now  and  in  the  future. 
Therefore,  the  Army  is  preparing  to  offer  a  basic  course  in 
instructional  technology  with  satellite  workshops  designed  for 
more  precisely  defined  training  development  and  management  tasks. 
The  ability  of  the  Training  and  Doctrine  Command  to  meet  its 
training  missions  of  the  future  may  be  largely  determined  by  its 
ability to provide rapid response to urgent training requirements
with  surgical  precision. 


Summary 

To  summarize  the  challenge  that  faced  the  Combat  Arms 
Training  Board  in  fulfillment  of  its  change  agent  role  for  Army 
training,  several  obstacles  may  be  identified.  They  are:  (1) 
the  existing  information  system  was  not  prepared  to  address 
training  problems;  (2)  there  was  a  shortage  of  instructional 
technologists  who  could  be  instrumental  in  developing  improved 
training  materials;  (3)  responsible  commanders  required 
briefings,  and  in  some  cases  convincing,  on  the  benefits  of 
implementing  modern  instructional  methods,  and  (4)  the  total 
effort  required  centralized  control  and  guidance  to  insure  proper 
system  integration. 

The  solutions  to  these  major  obstacles  came  about  through 
(1) a large-scale effort to place job analysis information into
the  existing  data  base;  (2)  the  conduct  of  workshops  to  provide 
initial  instructional  technology  training  to  personnel  involved 
in  this  effort;  (3)  briefings  to  inform  responsible  commanders 
and  their  staff  of  the  immediate  and  long-range  benefits  of 
instructional  technology  to  insure  their  support  in  this  effort; 
and  (4)  the  Combat  Arms  Training  Board  being  established  by  the 
Army  as  the  principal  agency  for  control  and  guidance  of 
implementing  modern  instructional  technology  into  Army  training. 


A Final Note


This  chronology  has  attempted  to  provide  a  historical 
perspective  on  major  initiatives  in  instructional  technology  that 
were supported by the Combat Arms Training Board. However, many
significant initiatives (e.g., training simulation, sub-caliber
training devices, gaming simulation, and others) were not
addressed in this paper. This is not due to oversight, but rather to
the focus of the paper being on generic instructional technology
training support and not on specific training programs.


INTRODUCTION


The  U.  S.  Coast  Guard  Training  Center  at  Governors  Island  is  responsible 
for  resident  training  in  four  general  categories: 

(1)  Technical  training  to  qualify  personnel  for  job  entry  in  seven 
specialty  ratings; 

(2)  Advanced  training  in  maintenance  and  operation  of  specialized 
equipment  in  four  specialty  ratings; 

(3) Mission-related training for officers and rated petty officers
in  two  mission  areas; 

(4)  Program-related  training  for  rated  petty  officers  in  two  program 
areas.

For convenience, these categories are placed in two broad areas.
Technical training courses for job-entry qualification are grouped into
Class "A" Schools; the others are grouped into Class "C" Schools. Over
2000 students are graduated annually, approximately 50% in each broad
area.  There  is  great  diversity  among  the  individual  categories,  and  even 
greater  variety  when  these  categories  are  broken  into  specific  courses. 
For  example,  in  the  Electronics  Technician  (ET)  job  entry  category  alone, 
there  is  one  basic  core  curriculum,  and  three  advanced  tracks  (an 
integrated  series  of  courses  leading  to  qualification)  which  combined 
comprise  a  total  of  36  individual  courses. 

Managing  a  Training  Division  characterized  by  so  many  unique  Branches, 
Sections,  Schools  and  Tracks  is  a  significant  challenge. 

This  is  not  a  research  paper,  nor  is  it  a  "How  to  Do  It"  paper.  It  is 
intended  to  raise  some  questions,  present  problems  which  we  face  in 
common,  and  to  seek  assistance.  The  basic  question  is  -  How  do  we  know 
that  what  we  are  doing  is  effective  and  productive? 

As training managers, we need to have some way to measure the
effectiveness and productivity of our training systems. These measurements
may be as specific or as general as meets our needs, but in any case, we
must be able to determine how well we are doing with what we have to work
with.  A  military  training  manager's  primary  job  is  to  meet  the  training 
needs  of  the  "field".  We  must  be  concerned  not  only  with  the  quantity  of 
graduates,  but  with  the  quality  of  the  graduate's  performance  as  well. 
Secondarily,  but  closely  behind  in  importance,  we  must  insure  that  we  are 
using  the  taxpayer's  money  wisely. 

We have a distinct disadvantage in contrast to our counterparts in
civilian  education  and  industry.  Most  of  us  are  not  professional 
trainers.  We  are  professional  soldiers,  sailors,  or  airmen  who,  by 
choice  or  otherwise,  find  ourselves  in  the  role  of  training  manager  -  a 
role which we may play for only a relatively brief period in our military
careers. Some of us may have had the advantage of formal training to
prepare  us  for  this  role.  For  the  majority  of  us,  that  is  probably  not 
the  case.  We  have  learned  what  we  know  about  managing  training  mostly 
through  the  legacy  of  our  predecessors.  Ours  is  certainly  not  the  ideal 
system,  but  it  is  one  which  has  certain  advantages  when  viewed  in  the 
overall  military  context.  If  our  predecessors  have  built  a  strong  and 
well-documented  program,  then  we  have  something  to  build  upon.  If  not, 
then  we  have  to  scratch  for  our  own  solutions  until  we  can  build  our  own 
functional  management  systems. 

Our  predecessors  at  Coast  Guard  Training  Center  Governors  Island  have 
been  kindly.  Nonetheless,  we  incumbents  have  found  that  no  tools  have 
been  left  behind  to  help  us  to  make  the  vital  training  effectiveness  and 
training  productivity  measurements.  How  do  we  know  if  our  programs  are 
effective? How do we know if they are productive? How do we know if
changes are necessary? And if so, what changes? Where are our
benchmarks?


BACKGROUND 

The training philosophy at Coast Guard Training Center Governors Island
is undergoing an evolution, perhaps not as dramatic as the evolution
of training in our counterpart DOD organizations, but an evolution
nonetheless. We are progressing in all of the individual and integrated
courses offered from traditional subject-matter based instruction to
performance-based instruction. This evolution began in 1973 with the
conversion of ET training. Following our experience in that
program, it is being carried out in all of the other rating areas as well.
Because  the  conversion  in  ET  training  is  complete,  we  will  be  using  that 
as  a  primary  example  in  this  paper. 

The  previous  ET  graduate  was  a  generalized  specialist.  He*  knew  a  great 
deal  about  a  lot  of  things  related  to  electronics.  He  was  well  educated 
in  his  discipline,  but  not  necessarily  well  trained  to  perform  the 
technical tasks expected of him upon job entry at his first duty tour.
The ET could expect to be assigned to either a Coast Guard cutter or to an
isolated LORAN station, and since they all received identical training
heavily loaded with electronics "theory", none could be considered by
present standards to be qualified to maintain and repair the specific
equipment found at his first duty station. It was as if we were
graduating apprentice electronics engineers rather than apprentice
electronics technicians. Consequently, a period of OJT following job entry was
required to enable them to do what their training should have prepared
them to do. This was certainly neither an effective nor a productive
training method.

*For convenience only, the masculine pronoun will be used throughout,
rather than he/she or him/her.

The performance-based system which we have implemented is designed to
insure that the student can perform those tasks which will be required
of him upon job entry, no more and no less, and to perform them up to the
requisite  standards  prior  to  graduation.  His  performance  is  reinforced 
through  a  series  of  "real  life"  performance  tests  in  which  he  is  required 
to  trouble-shoot  and  repair  the  exact  same  type  of  equipment  that  he  will 
be  working  with  at  his  first  duty  station.  So,  by  matching  our  training 
objectives  closely  to  the  actual  job  tasks,  and  our  performance  testing 
standards  closely  to  job  performance  standards,  we  should  be  able  to 
insure  that  the  graduate  is  able  to  directly  transfer  his  training 
experience  to  his  work  immediately  upon  job  entry  with  minimal  OJT. 

We completed the ET conversion in 1976. We were satisfied that the new
system  was  effective.  After  all,  we  were  able  to  reduce  the  basic  ET 
core  curriculum  from  20  to  12  weeks  by  cutting  out  all  of  the  "nice  to 
know"  theory.  Granted,  the  new  performance-based  training  was  an  order 
of  magnitude  more  expensive  than  the  old  subject-matter  based  system 
because  of  the  necessity  to  procure  more  of  the  actual  electronic  devices 
used  in  the  field,  and  more  test  equipment,  but  we  were  turning  out 
better technicians. But, were we?

The  first  reaction  from  the  field  was  negative.  The  new  technicians 
"didn't  know  anything".  No  specific  criticism,  they  were  just  "not  as 
good  as  they  used  to  be".  These  criticisms  caused  us  great  concern.  We 
sensed that the old system was not effective, and was unproductive, but
the field had been satisfied for years! We believed that the new system
was extremely effective and productive, yet the field griped! The only
way to insure that what we were doing was not only effective, but
productive, and to respond to the field criticism, was to measure our
effectiveness first, and then, using that data, to see how productive we
were. Direct and objective feedback on the performance of our graduates
from  the  field  was  needed. 

The  decision  was  to  conduct  surveys  of  the  performance  of  graduates  who 
had  been  on  the  job  for  a  minimum  of  6  months  and  a  maximum  of  one  year. 
In  order  to  be  useful  in  measuring  the  effectiveness  of  training,  the 
surveys  had  to  be  specific.  Each  job-task  related  to  the  newly  learned 
technical  skill  had  to  be  probed. 

MEASURING  EFFECTIVENESS 

In order to base surveys of training effectiveness on fundamentals which
would remain relatively constant, thereby enabling re-surveys to measure
the same aspects for comparison, it was proposed to base the
survey questionnaires on the Enlisted Qualifications Manual
(CG-311), the paradigm of qualifying standards for each rate, to produce
the quality product required. Conferences were held with the School
Chiefs and Instructors to verify that CG-311, their curricula, and their
Performance Objectives were indeed the same. Since Governors Island
Training Center had recently reviewed and updated all Performance
Objectives to meet the requirements of the Qualifications Manual,
agreement was forthcoming, and the criterion was selected.

The format chosen for the questionnaires was a job task inventory
developed from the tasks enumerated in CG-311 for E-4s. The imperative
duties to be evaluated by the field were extracted, and final
selections  were  made  by  a  concurrence  of  Branch  Chiefs  and  Instructors  of 
the  particular  schools. 

The  questionnaire  packet  consists  of  a  preparatory  letter  sent  to  the 
recent  graduate  two  weeks  before  the  arrival  of  the  questionnaire.  Two 
weeks  later,  the  entire  packet  is  sent  to  the  graduate's  Commanding 
Officer. It is requested that the Commanding Officer forward the
graduate's questionnaire to the individual, that the Supervisor's
questionnaire be forwarded to the person most qualified to evaluate the
graduate's  performance,  and  finally  that  the  Commanding  Officer  complete 
a  critique  sheet  evaluating  the  Supervisor's  appraisal.  An  open-ended 
question  is  included  in  each  packet  for  respondents  to  indicate  "missing 
training  elements",  for  possible  future  curriculum  inclusion. 
Additionally,  a  reminder  letter  is  sent  to  the  graduate  in  the  event  his 
questionnaire  is  not  returned  within  a  month  from  the  time  of  mailing. 
Both graduate's and Supervisor's questionnaires consist of an
instruction  sheet,  a  biographical  data  sheet,  and  the  necessary  number  of 
pages  of  job  task  statements. 

The utilization of simply-structured job task statements as the basis of
the questionnaire eliminates a degree of generality by pinpointing
specific learning elements. The respondents are asked to rate the job
tasks in two categories: Frequency of Performance of specific tasks in
the field and Adequacy of School Training for the task. The response to
Frequency is clearly objective; some element of subjectivity is apparent
in the Adequacy rating; however, since it is evaluation (opinion) which is
sought, this is vital.

The number of tasks, categorized in major AREAS of performance, runs
high. In all surveys, tasks averaged 10S. This great number of tasks is
deceptive, however, for each one is composed of a simple statement
containing only one duty, one thought. A judgment is concise and
succinct because there is only one issue to be evaluated. Since there is
no delineating selectivity involved in the mental process, the time
involved to weigh each task is short.

FEEDBACK METHOD

This  feedback  method  provides  a  comparative  study  between  choices 
selected  by  graduates  and  their  Supervisors.  It  provides  information  as 
to  (1)  how  the  graduate  views  his  job  based  on  task  performance 
frequency;  (2)  how  adequately  he  feels  he  has  been  trained  to  perform 
each task; (3) how his Supervisor views the man's job task performance
frequency; and (4) how well the Supervisor feels the man has been trained
in school to perform each task effectively. Such response enables
determinations as to degrees of under-training, adequacy of training, or
over-training. The biographical data sheets provide additional in-put
for final analyses of the responses. They indicate the Supervisor's
rate, the graduate's assignment, the length of time the graduate has been
on  the  job,  any  intervening  training  he  might  have  received,  or  any 
activities  which  might  have  delayed  commencement  of  duty. 
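
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not give the rating scales or the rule used to decide between under-training, adequate training, and over-training, so the 1-5 scales, the MARGIN threshold, and the names classify_task and compare_views are all hypothetical.

```python
from statistics import mean

# Assumed scales (not specified in the paper):
#   frequency: 1 = never performed ... 5 = performed very often
#   adequacy:  1 = poorly trained  ... 5 = very well trained
MARGIN = 1.0  # hypothetical tolerance before a task is flagged


def classify_task(frequency, adequacy, margin=MARGIN):
    """Label a task from lists of frequency and adequacy ratings."""
    gap = mean(adequacy) - mean(frequency)
    if gap > margin:
        return "over-trained"    # trained well beyond how often it is done
    if gap < -margin:
        return "under-trained"   # done often, but training rated weak
    return "adequate"


def compare_views(graduate, supervisor, margin=MARGIN):
    """Classify one task from both viewpoints and note agreement."""
    g = classify_task(graduate["frequency"], graduate["adequacy"], margin)
    s = classify_task(supervisor["frequency"], supervisor["adequacy"], margin)
    return {"graduate": g, "supervisor": s, "agree": g == s}


# Invented ratings for a single task, for illustration only
graduate_ratings = {"frequency": [1, 1, 2], "adequacy": [4, 5, 4]}
supervisor_ratings = {"frequency": [2, 1, 2], "adequacy": [4, 4, 4]}
result = compare_views(graduate_ratings, supervisor_ratings)
print(result)
```

A simple gap rule like this ignores task criticality and the length of time on the job, which the biographical data sheets would supply in a fuller analysis.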

The  Commanding  Officer's  critique  offers  a  verification  of  the 
Supervisor’s  opinion  of  performance  and  adequacy. 

The  open-ended  question  for  "missing  elements  of  training"  provides  an 
opportunity  for  graduates  and  Supervisors  to  objectively  enumerate 
training needs not being met and which appear to them to be essential for
productive  field  performance. 

RESULTS;  INDICATIONS;  IMPLICATIONS 

The  decision  was  made  to  initially  survey  each  of  the  schools  with 
follow-up surveys every six months. To date, all Class A Schools and
subordinate tracks have been surveyed. The survey population consisted
of 678 graduates, their Supervisors and Commanding Officers. Percentage
of returns averaged 59%.

A first analysis indicates major AREAS of Over-training, Adequacy of
Training and Under-Training. This gives an immediate indication of
over-all Adequacy or Inadequacy of training.

Within these major AREAS, SPECIFIC tasks are then examined in light of
training requirements.

Once  over-all  training  effectiveness  is  determined,  an  in-depth  study  of 
both these categories, AREAS and SPECIFICS, reveals many aspects for
consideration. The following are some ramifications of the surveys. It
is interesting to note that even though the same evaluative vehicle was
used, feed-back was different, and idiosyncratic to the individual
school or track. A review of major AREAS shows generally that:

(1) if, as in the Electronics survey, the majority of responses
indicates Adequate Training (most tasks are performed very often and
trained well), then the over-all implication is that the newly instituted
ET program is indeed accomplishing its mission. This is particularly
significant as an evaluation of training effectiveness since, as
previously discussed, the Electronics Class "A" curriculum at Governors
Island is totally a "hands-on" training experience. The change of policy
to need to know, from nice to know, has succeeded, overcoming original
antipathy. The ET program is teaching to CG-311, and concomitantly,
CG-311 does define what an entry-level technician needs to be able to do in
the field. It is interesting to note also, as supportive data to the
field survey results, that students trained in the new curriculum fared
better on the Coast Guard servicewide exam, which rank orders qualified
candidates for advancement, than those previously trained under the
older system in four out of six test areas (Administration, Safety,
Electricity & Electronics, Solid State Theory), and scored less than 1
percentage point below previously trained students in the other two areas
(Test Equipment and Technical Maintenance), even though the old
curriculum was heavily loaded with "theory" in contrast to the hands-on
application in the new curriculum.

(2) if, as in the Gunnersmate and Damage Controlman surveys, results
indicated some areas of gross over-training, investigation revealed that
graduates' and Supervisors' ratings of frequency of task performance
and of adequacy of training vary according to duty station. A
Damage Controlman stationed on land does not perform tasks related to
anchor windlasses and anchoring equipment, for example. Consequently,
he rates these tasks as over-trained and suggests eliminating or
drastically reducing training. However, the odds are, if he remains in the
Coast Guard, he will be assigned to different billets where he must use
these skills. Conversely, a Damage Controlman assigned to a floating
unit would rate woodworking and building maintenance similarly.


(3) if, as in the Gunnersmate Survey results, a graduate is technically
trained for a shipboard station and he is assigned to a shore station,
his evaluation of training will be affected. If the facilities on which
he was trained are not available for use, his responses will fall into
the over-trained category. However, it must be borne in mind that at a
future date he may be transferred to a duty station where the facilities
are available, and at that time, he must have had some "overtraining" or
his recall will be insufficient without having additional on-the-job
training. Also, criticality of tasks must be considered. In the event of
a national emergency, that which is considered "over-training" in a
peace-time Coast Guard might readily become "under-training".
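
Points (2) and (3) suggest breaking ratings down by duty station before treating a task as over-trained. The following Python sketch shows one way that breakdown might look. It is an assumption-laden illustration, not the Training Center's procedure: the station labels, the 1-5 frequency scale, the spread threshold, the sample records, and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_frequency_by_station(responses):
    """Average frequency rating for each (task, station type) pair.

    responses: iterable of (task, station_type, frequency_rating).
    """
    grouped = defaultdict(list)
    for task, station, rating in responses:
        grouped[(task, station)].append(rating)
    return {key: mean(values) for key, values in grouped.items()}


def station_dependent_tasks(responses, spread=2.0):
    """Tasks whose mean frequency differs between station types by at
    least `spread` (a hypothetical threshold on an assumed 1-5 scale)."""
    per_task = defaultdict(dict)
    for (task, station), avg in mean_frequency_by_station(responses).items():
        per_task[task][station] = avg
    return sorted(
        task
        for task, avgs in per_task.items()
        if len(avgs) > 1 and max(avgs.values()) - min(avgs.values()) >= spread
    )


# Invented sample records, for illustration only
sample = [
    ("anchor windlass maintenance", "afloat", 5),
    ("anchor windlass maintenance", "ashore", 1),
    ("woodworking", "afloat", 1),
    ("woodworking", "ashore", 4),
    ("pipe repair", "afloat", 4),
    ("pipe repair", "ashore", 4),
]
flagged = station_dependent_tasks(sample)
print(flagged)
```

Tasks flagged this way are candidates for retention in the curriculum even when one group of respondents rates them as over-trained, since a later billet may require them.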

Reference to SPECIFIC tasks indicated that, in ET training, the term
"THEORY", which had been the "catch-all" of needs, disappeared, and
specificity of training requirements pinpointed actual additional
training requirements in the revised ET program and the three tracks.
One important adjustment to Electronics Training which resulted from
this definition of needs was the addition of maintenance of
Communications equipment to the Loran Track curriculum. For an
Electronics Technician at an isolated Loran Station, knowledge of
Communication equipment maintenance was vital. The practicality was
evaluated and the curriculum expanded. Once correlated with the
graduate's assigned duty station, the surveys indicate which particular
training needs require strengthening as the thrust of his technical
responsibilities at that station varies from the core program at the
Training Center. They indicate a possibility of more accurately teaching
to placement. They also supply information as to what a graduate needs
when, even though an apprentice, he is assigned to a billet requiring a
much more experienced man (for example, assigning an E-4 to a small ship
wherein he alone is responsible for advanced as well as elementary
duties).

SPECIFICS, as well as AREAS studied, also provide in-put for review of
"in-house" Training Center testing. If categories are substantially
weak in the field, perhaps "in-house" testing was an inadequate
predictor. Perhaps the tests were not accurately measuring the man's
progress as he was being trained in particular skills. Perhaps a
re-evaluation of testing methods is suggested.

A  study  of  the  "MISSING  TRAINING  ELEMENTS"  question,  which  often  expands 
itself  to  encompass  aspects  not  strictly  considered  training  needs, 
provides  important  feed-back  for  Training  Center  -  Field  communication 
and  inter-play. 

The  request  for  more  leadership  training  appeared  thematically 
throughout  one  school  survey.  This  is  a  problem  which  cannot  be 
primarily  addressed  as  a  training  need  in  our  system.  The  sole  mission 
of  CG  Training  Center  Governors  Island  is  to  produce  an  entry-level 
technician  -  and  leadership  is  not  indigenous  in  apprenticeship, 
although  it  is  inherent  in  a  military  ethical  standard.  It  is 
interesting,  however,  to  try  to  discover  a  way  to  infuse  this  intangible 
moral  element  into  vocational  training  without  actually  teaching  it,  as 
there  is  no  provision  for  such  instruction  either  in  time  or  in  the 
curriculum. Yet it is essential for advancement.

This area is now being investigated. We plan to use as an appendix to
future field surveys a Military Performance Questionnaire. Both the
graduate and his supervisor will be asked to evaluate how frequently
tasks relative to military aspects of the Coast Guard environment are
performed in the course of a graduate's initial duty tour and how
adequately emphasis was placed on this aspect during technical training.
While this segment of militia is not taught per se at the Training
Center, it must be learned through emulation. As instruction must be
effective for productivity, so must be the Instructors, for their impact
must not only be on a technically trained man, but on a military
technically trained man.

The utilization of the initial job task analysis field survey provided an
evaluation that training in our Class "A" Schools is adequate and
effective. Areas of change which were indicated have been implemented.
It appears that at the present time the delicate balance of
productivity has been achieved: the product is adequate for the degree
of training, the amount of time expended and the training cost incurred.
However, the politics of experience has shown there is rarely
constancy in a changing society, and the Coast Guard is no exception. The
advent of a lowered entry score for school admission, with exit
standards remaining the same, and what this implies call for another
evaluative examination, as do curriculum changes implemented after the
initial field surveys. A process of periodic re-surveys to measure
training adequacy must be the barometer for productivity.

CAN TRAINING PRODUCTIVITY BE MEASURED?


Productivity  Measurement  requires  an  analysis  of  input,  process,  and 
output (see Figure 1). On the input side there are: (1) manpower
needs of the field; (2) job-task inventories (JTIs); (3) established job
performance standards; and (4) school entry standards. The manpower needs
translate  into  the  number  of  students  which  will  be  enrolled  in  training 
during  any  one  period.  JTIs  and  Job  Performance  Standards  dictate  the 
training  objectives  which  must  be  met. 

The training process consists basically of the methodology of training,
the length of training, and resources (manpower and funding).


Figure 1


The  measurable  output  consists  of  the  quantity  and  the  quality  of 
graduates. These, too, have a cyclical effect on input, where there is a
shortfall or overabundance of either.

If aspects of any of the three change, the productivity will change, either in
quantitative  or  qualitative  results.  The  ability  to  measure 
effectiveness  is  the  key  to  measuring  the  productivity  of  the  training 
system  as  we  have  seen.  But  can  it  be  done?  We  believe  so,  although  we 
have not yet found the right formula to fit our needs.

At  Training  Center  Governors  Island,  we  have  seen  that  by  converting  our 
various  curricula  to  performance-based  instruction,  we  can  reduce  the 
length of training and still maintain effectiveness. The resulting
personnel cost decrease seems to have balanced the increase in capital
costs required in a total performance-based system. But what happens if
changes outside our control occur? Any of the input variables may
change, which will force a change in the training process if the
output is to remain constant. Requirements for change in the process
will also have an impact on the output.

For example, last year the Coast Guard reduced its Class "A" School
entrance  standards  for  all  but  a  few  ratings.  This  move  was  intended  to 
provide  a  greater  opportunity  for  more  people  to  be  able  to  learn  a 
technical skill and progress toward a Coast Guard career. Although
school  entrance  standards  have  been  lowered,  job  entrance  standards 
remain  unchanged.  This  means  that  some  aspect  of  the  training  process 
has  to  be  adjusted  in  order  to  maintain  the  necessary  quantity  and 
quality  of  graduates.  Frankly,  we  don’t  know  what  the  overall  impact  is, 
although  we  have  accepted  the  fact  that  the  length  of  training  will 
probably  increase  and  that  the  attrition  rate  may  rise. 

There is also evidence that the Coast Guard may undergo a drastic
decrease  in  support  billets  over  the  next  few  years.  That  indicates  a 
cut  in  training  resources,  even  though  the  input  and  output  needs  will 
remain  unchanged.  There  will  certainly  be  an  impact  on  our  training 
productivity  even  though  other  in-house  adjustments  to  either 
methodology  or  length  of  training  are  made. 

Some of our ratings are currently undergoing job task inventory. We
welcome that. But as the job tasks may change, so must our training
processes. Zero-based budgeting requires us to measure the
productivity of every program conducted. If we were to be asked what we
could produce if our FY79 budget was cut 80%, what could we answer? That
we would produce 80% fewer qualified graduates, or the same number, but
each only 80% qualified? We need a more definitive method to provide
some  definitive  answers.  We  not  only  need  to  quantify  our  current 
productivity, but we need to be able to see how a variety of changes will
impact  on  that. 

PRODUCTIVITY  BENCHMARKS 

The  first  thing  we  have  to  do,  after  determining  the  effectiveness  of  our 
current  training,  is  to  establish  some  realistic  benchmarks  about  which 
we can make realistic judgements. The first benchmark necessary is the
job-performance  standard.  In  our  system,  job  performance  standards  are 
established  by  the  program  manager  or  subject-matter  expert.  The 
standards have to be translated into quantitative values so that
comparisons  can  be  made  with  in-training  performance  test  standards.  If 
valid,  performance  test  results  should  be  relatively  accurate  predictors 
of on-the-job performance. In our field survey system, a verdict of
adequacy  (neither  over  nor  under-training)  would  confirm  that  our 
performance  test  standards  are  equivalent  to  job  performance  standards. 
For example, in the Electronics Fundamentals Section of the ET
curriculum, field surveys showed the training to be adequate. 3,900
performance test scores were then analyzed to determine the average
scores  in  each  module.  These  average  scores  have  then  been  established 
as  the  quantitative  job  performance  benchmark  for  those  tasks  trained. 
To  make  this  benchmark  meaningful  in  productivity  measurement  a  relative 
cost  to  achieve  it  has  to  then  be  determined. 
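
As an illustration only, the averaging step described above can be sketched in a few lines of Python. The module names and scores below are hypothetical placeholders, not data from the ET curriculum; the sketch simply shows how a per-module average could be derived from performance test scores and held as a quantitative benchmark.

```python
from statistics import mean

# Hypothetical performance-test scores (0-100), grouped by training module.
# Neither the module names nor the scores come from actual field data.
scores_by_module = {
    "DC circuits": [82, 75, 91, 68],
    "AC circuits": [77, 85, 80],
}


def module_benchmarks(scores):
    """Return the average score per module, rounded to one decimal place."""
    return {module: round(mean(vals), 1) for module, vals in scores.items()}


benchmarks = module_benchmarks(scores_by_module)
```

A benchmark of this kind is only as good as the field survey that judged the training adequate; the averages carry no meaning until that verdict has been confirmed.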

Juran's "Quality of Conformance Model" (See Figure 2) may help to
establish  and  then  confirm  the  job  performance  benchmark  in  terms  of 
costs.  According  to  Juran,  (1)  the  more  one  strives  for  perfection  in  a 
product,  the  higher  will  be  the  cost  of  producing  one  good  unit  of  that 
product  relative  to  a  conformance  standard.  A  state  of  diminishing 
returns  will  be  eventually  approached.  On  the  other  hand  there  will  also 
be costs incurred for an imperfect product (one less than 100% perfect),
which  Juran  calls  "cost  of  failure".  The  cost  of  failure  increases  as 
the  degree  of  perfection  of  the  product  decreases  from  100%  perfect.  In 
essence,  high  costs  are  incurred  at  either  extreme  of  conformance  to  a 
standard.  Using  these  principles,  a  training  manager  should  be  able  to 
plot the costs of successful training against the cost of failure in job
performance to determine total costs relative to productivity.
Conformance  or  quality  standards  based  on  job  performance  standards 
could  then  be  compared  with  total  quality  costs  and  adjusted  as  necessary 
to  achieve  optimization. 


(1) QUALITY CONTROL HANDBOOK by J. M. JURAN.

For example, assume that a conformance standard of 70 on a scale from 0
to  100  is  considered  as  qualifying  in  any  given  situation.  A  training 
cost  to  meet  that  standard  can  be  computed  based  on  existing  data.  An 
adequate  feedback  system  could  then  show  the  relative  cost  of  failure, 
that is, the training deficiency in the range of 71 to 100 on the same
scale.  If  it  is  found  that  the  relative  cost  of  failure  is  excessively 
high,  then  apparently  the  standard  is  too  low  and  should  be  adjusted 
upward.  If  the  relative  cost  of  training  is  too  high,  then  the  standard 
should  be  adjusted  downward.  The  relative  cost  of  failure  will  be  an 
extremely  difficult  value  to  define  and  will  probably  have  to  be 
expressed only in terms of a percentage of under-training determined from
effectiveness surveys.


The  standard  would  then  fall  into  one  of  three  zones  described  by  Juran 
on  a  total  quality  cost  curve:  a  "zone  of  improvement"  (standard  is  too 
low),  a  "zone  of  perfection"  (standard  is  too  high),  and  an  "optimum 
zone"  (standard  is  just  right).  (See  Figure  3). 
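
The three zones can be illustrated with a minimal sketch, offered under stated assumptions rather than as Juran's own formulation. The shapes of the two cost curves, their weights, and the 5% tolerance that defines the optimum zone are all invented for illustration; real values would have to come from training cost data and field surveys.

```python
# Hypothetical cost curves on a 1-99 conformance scale. The curve shapes,
# the weights, and the tolerance are illustrative assumptions, not values
# taken from Juran or from Coast Guard data.
TRAINING_WEIGHT = 1.0   # assumed weight on the cost of training
FAILURE_WEIGHT = 4.0    # assumed weight on the cost of failure on the job
TOLERANCE = 0.05        # assumed width of the optimum zone (5% above minimum)


def training_cost(standard):
    """Cost of training rises without bound as the standard nears 100."""
    return TRAINING_WEIGHT * standard / (100 - standard)


def failure_cost(standard):
    """Cost of failure rises as the standard falls away from 100."""
    return FAILURE_WEIGHT * (100 - standard) / standard


def total_cost(standard):
    """Total quality cost: training cost plus cost of failure."""
    return training_cost(standard) + failure_cost(standard)


STANDARDS = range(1, 100)
OPTIMUM = min(STANDARDS, key=total_cost)


def zone(standard):
    """Classify a standard into one of the three zones on the total cost curve."""
    if total_cost(standard) <= (1 + TOLERANCE) * total_cost(OPTIMUM):
        return "optimum zone"
    if standard < OPTIMUM:
        return "zone of improvement"
    return "zone of perfection"
```

Calling zone() on a proposed standard indicates whether, under the assumed curves, the standard should be raised, lowered, or left alone. Changing the weights moves the optimum, which is the adjustment process described in the text.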


Figure  3 


A criticism may arise that a training manager who applies this form of
quality control is only putting out a mediocre product. If one strives
for only 70% of a quality value, for example, then that criticism is
valid, but only on paper. Our experience has shown that if the quality
standard is realistically based on the best data available, a great
portion  of  students  will  achieve  beyond  that  standard  provided  the 
motivational  environment  is  right  and  no  other  restraints  are  placed  upon 
them.  To  paraphrase  an  old  duplicate  bridge  axiom:  "Striving  for  the 
best  possible  result  will  result  in  failure  most  of  the  time.  Striving 
for  the  best  result  possible  will  result  in  success  most  of  the  time." 
Granted,  there  are  those  areas  where  perfection  in  training  is  an 
absolute  must  for  job  entry.  The  astronaut  training  program  is  an 
example.  Perfection  here  is  necessary  because  the  cost  of  failure  is  so 
great. The program manager has to accept this cost if the program is to
continue.  In  our  experience  though,  that  condition  is  the  exception 
rather  than  the  rule. 

The  Juran  conformance  to  standard  model  can  also  be  used  to  establish  and 
confirm  a  realistic  pass-fail  threshold.  For  instance,  the  value  which 
is  represented  by  the  line  of  demarcation  between  the  optimum  zone  and 
the  zone  of  improvement  on  the  total  quality  cost  curve  might  be 
considered  to  be  a  realistic  pass-fail  threshold  once  the  quality 
standard has been determined.

The  next  benchmark  for  measuring  training  productivity  is  a  realistic 
attrition  rate.  This  certainly  relates  to  an  established  conformance 
standard. By applying Juran's principle here, as well, an optimum zone
can  be  determined.  Since  our  task  as  training  managers  is  to  provide  a 
given  quantity  of  qualified  graduates  to  meet  field  needs,  the  ideal 
attrition  rate  is  zero.  But  from  a  productivity  viewpoint  that  is  not 
realistic.  If  the  conformance  standard  is  set  so  low  that  every  student 
passes,  there  is  a  risk  of  higher  costs  of  failure  in  job  performance. 
Conversely, if the standard is set too high, the quality needs will be
met  but  the  quantity  probably  won’t  be.  Therefore,  the  pass-fail 
threshold may have to be adjusted accordingly, even though a slightly
higher-than-optimum cost of failure may result. This circumstance may be
offset  by  additional  follow-on  OJT  to  reduce  cost  of  failure  somewhat. 
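
The link between attrition rate and quantity of graduates can be made concrete with simple arithmetic. The following sketch assumes that attrition is a single fixed rate applied to a class of entrants; the function name and figures are hypothetical.

```python
import math


def entrants_needed(graduates_required, attrition_rate):
    """Students to enroll so that expected graduates meet the field's need."""
    if not 0 <= attrition_rate < 1:
        raise ValueError("attrition_rate must be in [0, 1)")
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(graduates_required / (1 - attrition_rate), 9))
```

Under this assumption, any rise in the attrition rate must be offset by a larger intake if the quantity of graduates is to hold, and that larger intake feeds directly into training cost.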

Cost  per  student  trained  is  the  third  benchmark.  Student  time, 
instructor  time,  training  equipment,  and  operating  costs  are  elements 
which make up this benchmark. It is probably the most critical of
the  three  because  productivity  in  any  management  area  has  to  be  related 
to  cost  as  can  be  seen  in  Juran's  model.  It  interrelates  directly  with 
the other two benchmarks because they are expressed in terms of quality
and quantity costs. These three benchmarks (job performance standards, a
realistic attrition rate, and cost per student trained) are the basic keys
to training productivity measurement. We call them benchmarks because
they  are  just  that:  starting  points  on  which  we  can  base  a  training 
productivity  measurement.  The  benchmarks  must  have  some  flexibility  and 
should  be  adjusted  as  necessary  to  eventually  "bracket  in"  on  a  firm 
standard. Each, of course, must be confirmed by results of training
effectiveness surveys. Once established, these benchmarks should
provide starting points to enable us to solve some of the problems which
were discussed above. Please keep in mind that we have not yet come up
with a workable formula, but we are continuing our research. We will
explain generally how such a productivity measurement system may be
applied.


APPLICATION 

Consider some of the problems mentioned previously. The first of these
is  the  effect  of  reduced  school-entry  standards  with  no  offsetting 
reduction in job-entry standards. With reference to the input-process-output
model (Figure 1), the only adjustment which can be made is in the
process.  But  first  we  have  to  see  what  our  current  productivity  is 
before  we  can  measure  the  effect  of  change.  Let's  look  at  our  ET 
curriculum  as  the  best  example. 

Based  on  field  surveys  conducted  prior  to  this  change,  we  know  that  our 
current ET training process is meeting both the field quantity and
quality  needs.  We  also  know  the  number  of  student  hours,  instructor 
hours,  cost  of  training  equipment,  and  operating  costs  involved  to  meet 
those  needs.  From  analysis,  we  have  established  a  meaningful  performance 
standard  and  pass-fail  threshold.  Our  attrition  rate  has  leveled  and  has 
remained  fairly  constant  at  5%.  We  also  know  the  training  cost  per 
student.  These  are  our  initial  benchmarks. 

Let's  then  consider  what  the  results  of  lower  entry  standards  might  be. 
We  suspect  that  it  is  a  greater  number  of  students  with  some  reduced 
learning abilities, probably lower reading and mathematics skills. Next,
consider what the impact on productivity will be if no changes in process
are introduced. Without reducing job performance standards (benchmark
1), it is obvious that a higher attrition rate (benchmark 2) will result.
Consequently, we will fail to meet the number of graduates required by
the field. If performance test standards are reduced in order to
maintain the lower attrition rate, a greater risk of on-the-job failure will
result. Therefore, a change in process is mandated. We could change the
methodology, increase the length of training, or increase resources.
Each will result in some cost increase (benchmark 3), thereby affecting
productivity. We believe that the current methodology (performance-based
instruction) is ideal for this situation, although we are
considering  adding  some  self-study  enrichment  programs  which  may  assist. 
An  increase  in  resources,  except  for  student  time,  is  probably  not  the 
answer.  Therefore,  the  logical  adjustment  is  to  increase  the  length  of 
training.  In  essence,  the  slower  learning  student  will  be  given  more 
time  to  learn.  This  will  result  in  a  proportional  increase  in  cost  per 
student  trained. 
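
The proportional relationship between course length and cost per student can be sketched as follows. This is a simplified illustration, not a workable productivity formula: it assumes a constant weekly cost per student, assumes every entrant incurs the full course cost (including those who later attrite), and holds the attrition rate fixed. Only the 5% attrition rate comes from the text; the weekly cost, course lengths, and class size are hypothetical.

```python
def cost_per_graduate(weekly_cost_per_student, weeks, entrants, attrition_rate):
    """Total training cost divided by the number of graduates.

    Assumes, for simplicity, that every entrant incurs the full course cost,
    including those who later attrite.
    """
    total = weekly_cost_per_student * weeks * entrants
    graduates = entrants * (1 - attrition_rate)
    return total / graduates


# Hypothetical figures: a 20-week course lengthened to 24 weeks.
before = cost_per_graduate(500, 20, 100, 0.05)
after = cost_per_graduate(500, 24, 100, 0.05)
increase = after / before - 1  # fractional increase in cost per graduate
```

With attrition held constant, a 20% longer course yields a 20% higher cost per graduate. If attrition also rises, as may happen under lower entry standards, the increase would be larger still.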

What  does  this  mean  in  terms  of  productivity?  A  greater  course  length 
will  result  in  a  relatively  higher  training  process  cost.  Is  that 
productive?  If  it  meets  the  objective  of  the  school-entry  standard 
reduction (to give a greater opportunity for more people to learn a
skill) and still satisfies field needs, the answer is yes. The next step
is  to  conduct  a  field  survey  to  determine  the  job  performance  of  those 
graduates  who  had  entered  training  under  the  lower  entry  standards.  This 
will  serve  to  measure  the  effects  of  the  reduced  school-entry  standards 
on  job  performance,  if  any,  to  confirm  that  the  adjusted  training  was 
effective,  and  to  confirm  the  adjusted  cost  of  training  benchmark. 

Another problem which we expect to face soon is a reduction in instructor
billets. We don’t know how many billets we might lose but we hope to be
able to make some realistic projections of its impact. If our
envisioned  productivity  benchmarks  are  valid  and  confirmed  we  should  be 
able  to  pinpoint  those  courses  in  which  we  can  effect  the  resource 
reduction  and  what  impact  that  will  have.  It  may  require  either  change 
in  methodology  or  in  course  length  or  in  perhaps  some  combination  of 
both.  If  it  appears  that  these  changes  may  result  in  a  reduction  of 
training  quality  or  quantity  of  graduates,  then  that  too  should  be 
evident.  In  any  event,  the  impact  of  a  resulting  cost  of  training 
decrease  which  should  result  from  reduction  of  resources  can  be  revealed. 
Is  the  sacrifice  of  some  degree  of  quantity  or  quality  of  graduates  in 
favor  of  lower  cost  of  training  productive?  Perhaps  it  is,  if  the 
relative cost of failure on the job is not too great.

What about zero-based budgeting? A productivity measurement system is an
absolute  must.  When  a  manager  can  show  in  quantitative  terms  the  effects 
of  resource  changes  on  productivity,  his  budget  estimates  have  a  firm 
foundation.  Many  of  us  at  our  levels  have  not  had  to  zero-base  our 
budgets  yet.  But  it  will  eventually  come  down  to  us.  We  will  have  to 
analyze  every  element  of  our  training  programs  to  justify  how  we  are 
conducting them. Well-defined standards, a realistic attrition rate, and
specific  cost  data  related  to  these  will  aid  us  in  that  justification. 

SUMMARY 

In this paper we have presented issues which are undoubtedly not new in
management.  Nor  is  our  approach  to  resolving  them  necessarily  unique. 
Perhaps  it  is  innovative  only  in  our  eyes  -  the  eyes  of  short-term 
training managers who will soon move on to some other field of endeavor.
Perhaps naively, we feel that it is the most practical approach to fit
our  needs  at  Coast  Guard  Training  Center  Governors  Island. 

The  problem  presented  is  manifold.  Is  our  training  effective?  Is  our 
training  system  productive?  What  are  the  impacts  of  a  variety  of  changes 
which  may  occur?  How  can  we  adjust  to  those  changes  and  still  be  able  to 
meet the quantity and quality of graduates needed by the field? What are
our  benchmarks? 

We  have  so  far  found  only  a  partial  solution  to  this  complex  problem. 
More  work  is  yet  to  follow.  We  have  found  that  we  can  measure  our 
training effectiveness. This measurement allows us to validate performance
testing and to ensure that these tests adequately predict on-the-job
performance. Through analysis of the feedback data and performance
test results, a job-performance benchmark can be established.

Further,  through  the  same  analysis,  a  realistic  pass-fail  threshold  is 
established.  This  value  allows  us  to  "bracket  in"  on  a  reasonable 
attrition rate - one that ensures that the quantity of graduates is
responsive  to  field  needs  and  at  the  same  time  minimizes  the  cost  of 
failure  on  the  job. 

Finally, we can develop a cost per student trained benchmark which
encompasses  all  costs  involved  in  the  training  process.  This  is  the 
pivotal  benchmark  because  it  allows  us  to  relate  the  quality  of  training 
and quantity of graduates to a cost value. Each of these benchmarks must
be  considered  in  the  productivity  measurement  system  which  we  envision. 
The  next  step  will  be  to  develop  a  formula  which  can  be  applied  to  answer 
any  of  the  questions  posed  above.  We  think  we  are  on  the  right  track,  but 
it  is  too  early  to  tell. 

The  main  rationale  behind  this  paper  has  been  not  only  to  explain  what  we 
have  done  and  what  we  hope  to  do,  but  to  appeal  to  the  military  training 
community  for  assistance.  We  would  greatly  appreciate  any  feedback  we 
can  get. 


FEMALE MILITARY PERSONNEL UTILIZATION AND COST EFFECTIVENESS


Traditionally, the military establishment of the United States has
been a masculine domain. For the greatest part of this nation's history,
women have not been permitted to serve in a military uniform except as
nurses. Indeed, serious interest in defining women's role in the armed
forces did not awaken until World War II, and it was not until 1948
that women achieved permanent military status. Thus, the formal
association of women with the armed services is a relatively recent
phenomenon. The changing role of women in the military establishment
of the United States in large part has mirrored their changing role in
American society; it has recently been influenced by military necessity.

HISTORY AND DEVELOPMENT: A Brief Overview of Women in the Military (1)

Prior to World War II, the United States armed forces were almost exclusively
male. While legendary women warriors such as DEBORAH SAMPSON ("Robert
Shurtleff") in the Revolutionary War, LUCY BREWER ("George Baker") in
the War of 1812 and MOLLY PITCHER did exist, these were indeed exceptions
to  the  rule.  Other  women,  as  civilians,  assisted  the  military  in  such 
capacities  as  nurses,  cooks,  laundresses  and  other  acceptable  feminine 
pursuits. 


(1) The major portion of the background section is from the Brookings
Institution study Women and the Military, Chapter 2.

In the wake of nineteenth century industrialism, American women developed
skills that were to become increasingly relevant to the military. In
fact, women dominated some occupations (for example, secretaries and
telephone  operators)  which,  with  changes  in  military  technology  and 
organization,  had  come  into  greater  demand  by  the  armed  services  by  World 
War  I. 

During World War I some 13,000 women were enlisted in the Navy Department.
However, after World War I they were demobilized. Thus, the few
remaining  women  in  the  armed  forces  were  found  in  the  nurses  corps. 

Between the two World Wars little interest in women in the military
existed. Although two plans (1) were designed that addressed women in the
military,  neither  plan  received  sufficient  support. 

World War II can justifiably be viewed as the turning point in the history
of women's quest for military status. Large numbers were involved - a
total of about 300,000 women served in the four military services. And
although the vast majority were employed in health care, administration,
and communications, women demonstrated their competence in virtually every
occupation outside of direct combat, including airplane mechanic, parachute
rigger, gunnery instructor, air traffic controller, and naval air navigator.
It is also worth noting that some 800 women served as Women's Air Forces
Service Pilots (WASPS). Although never accorded full military status,
they ferried all types of military airplanes, including combat aircraft.

Women's role in the Second World War was far more significant than is
suggested by the brief overview presented here. Perhaps the ultimate
compliment paid to the American women who served was offered by Albert
Speer, Adolf Hitler's weapons production chief, to now-retired
Lieutenant General Ira C. Eaker, an Army Air Force commander in Europe
during World War II:

How  wise  you  were  to  bring  your  women  into 
your  military  and  your  labor  force.  Had  we 
done  that  initially,  as  you  did,  it  could  well 
have  affected  the  whole  course  of  the  war.  We 
would  have  found  out,  as  you  did,  that  women 
were  equally  effective,  and  for  some  skills, 
superior  to  males. 

(1) One plan was developed under the auspices of Anita Phipps, the Army's
Director of Women's Relations; the other plan was directed by Major
E. S. Hughes.


With the conclusion of World War II came a large and rapid demobilization.
Further, the authorization for the WASPS was set to lapse in 1948. Therefore,
the  women  who  decided  to  remain  on  active  duty  after  the  war  were  in  a 
precarious  position. 

In 1948 the Women’s Armed Services Integration Act was passed. Hence,
women’s role was clarified; they were given the opportunity to pursue
a permanent military career. Many factors precipitated the enactment of
this legislation; one major concern was that the armed services, soon
to be without benefit of conscription, would have difficulty in meeting
their recruitment needs by voluntary means. Though signifying a major
breakthrough for women, the 1948 legislation also sowed the seeds of sex
discrimination  that  was  to  persist  for  two  decades.  Specifically,  the 
act  imposed  some  major  limitations  on  women  in  the  areas  of  recruitment, 
career  opportunity  and  dependency  status. 

With the decision in 1970 to end the draft, the United States embarked
on a venture unprecedented in any nation's history: to field a military
force over two million strong relying solely on volunteers. Could enough
men be found, willing and able to volunteer, without exorbitant additional
costs,  and  without  compromising  the  quality  of  military  manpower? 

The combined impact of the end of the draft, the Equal Rights Amendment
debate, the feminist litigation and the changing social attitudes affected
the women in the military in five major areas:

•  The  number  of  women  in  the  military  has  increased 
substantially; 

•  Personnel policies have been changed (e.g. aviation
training, command billets, wider job range, etc.);

•  The number of specialties open to women has increased
dramatically;

•  The proportion of women assigned to nontraditional jobs
has increased; and

•  The influx of women into the military has influenced
the socio-economic composition of the enlisted ranks.

In spite of the increased numbers of women in the military and their
expanded roles, there still exist factors that inhibit the maximum
utilization  of  women  in  the  military. 


INHIBITING  FACTORS 

The  factors  that  inhibit  the  maximum  utilization  of  women  in  the  military 
may be grouped into three major categories:

Legal  In 1969 the Secretary of Defense and the Chiefs of Staff of all
the Services signed the Department's Human Goals statement. This
statement declared that the Defense Department would strive "to make
military and civilian service in the Department of Defense a model of
equal opportunity for all regardless of race, sex, creed or national
origin ..." [Commanders Digest 15(8), 1974]. However, some laws exist
that prevent the full implementation of this policy. Specifically, legal
restrictions are imposed on the military by Sections 6015 and 8549,
Title 10 of the U.S. Code. Section 6015 prohibits the use of women on
Navy vessels, other than hospital or transport ships. Section 8549
prohibits the assignment of women to aircraft engaged in combat missions.

CONTRARY TO WIDELY HELD BELIEFS, the major restrictions on the recruitment
and functions assigned to women in the United States military establishment
are not incorporated in federal law. To be sure, few opportunities
in either the Army or Air Force would be closed to women if the statutory
restrictions on the utilization of military women were literally interpreted.
More limiting are the set of policies established by the military services
themselves based on their own interpretations of the national will as
[illegible]. Together, these [illegible] and policies [illegible]

Tradition and Values  The number of female applicants to the military
demonstrates their willingness to serve in what traditionally has been a
[illegible]. However, reluctance appears to exist to select
non-traditional [illegible]. A private sector study (Integration of
Females Into Male-Oriented Jobs: Experience of Certain Public Utility
[illegible], University of South Florida, [illegible]) suggests similar findings
in the [illegible] sector with respect to blue collar jobs. It was hypothesized
that women frequently had the opportunity to acquire the knowledge
and skills required for non-traditional white collar jobs, but did not have
the opportunity to acquire requisite skills for blue collar jobs. Another
factor appeared in experiences with blue collar jobs. Women did not
have as much intrinsic interest in non-traditional blue collar jobs as in the
non-traditional white collar jobs.


Martin Binkin and Shirley J. Bach - WOMEN AND THE MILITARY
The Brookings Institution, 1977 - p. 30

The factors of tradition and values, though very difficult to measure,
play a significant role in obtaining (or hindering) maximum resource
utilization of all military personnel. Women's role in the Armed
Forces will ultimately depend on the extent to which national institutions
- social, political, judicial and military - are willing to break with
their past - a past reflecting a persistent pattern of male dominance.

Sex Differences  An often quoted complaint by women's groups is that even
the  most  sophisticated  and  sympathetic  male  managers  cannot  easily  shed 
deeply  ingrained  attitudes  regarding  the  proper  roles  of  women  in  society. 
These attitudes creep into many decisions in which a person's sex is not
an obvious factor. Very often these are not straightforward decisions
involving favored treatment of a man over a woman, but rather decisions
which result in a specific treatment of women, without any regard to how
a  man  would  be  treated  in  identical  circumstances. 


Various traits of a "perfect man" and a "perfect woman" (have been)
established  through  sex-role  stereotypes.  Research  has  indicated  that 
there  is  a  positive  relationship  between  the  profile  of  mental  health  for 
an adult male and the general profile for a healthy adult, sex unspecified.

"Healthy adult female behaviors, then are seen as less socially desirable
and less mentally healthy than the behaviors of healthy adult males". (1)


The  general  increase  of  women  in  the  labor  force  and  their  inroads  into 
traditional  male  employments  has  led  to  a  gradual  change  in  society's 
attitude concerning women's capabilities. Cases of women performing
well in traditional male employments are no longer newsworthy.

Yet  these  changes  have  filtered  slowly  through  to  the  armed  services.  The 
acceptance  of  women  in  the  military  has  been  overshadowed  by  the  controversy 
surrounding  the  possible  acceptance  of  women  in  combat. 2 


The issue of combat is indeed a complex one and a complete discussion of
combat is beyond the scope of this paper. However, with respect to
tradition, George Quester has noted in his article "Women in Combat"
that "A nation forced to send its women into combat must be the underdog,
the nation that has been threatened, the nation that cares the very
most about the justice of its cause."


1 Ibid

2 Women (and Men) in the U.S. Army: A Study in Optional Utilization.
Michael John Castle, Naval Postgraduate School, Monterey, Calif.,
December, 1976.

One vivid example of how deeply stereotypes pervade our thinking is described
in a study conducted by Inge Broverman. The study involved various kinds of
psychotherapists, psychiatrists, psychologists and social workers. The
Broverman investigators developed a list of 122 pairs of traits (e.g.,
"very active vs. very passive"), and gave these lists to the therapists.
The therapists then employed a semantic differential format and broke out
each trait into a seven point scale (i.e., extremely active, moderately
active, slightly active, neutral, slightly passive, etc.). Having completed
this breakout, the therapists then were to indicate on the 122 scales
where they thought a healthy adult would fall.

Next the therapists were to go through the scales again and indicate where
a healthy male and a healthy female would fall.

It is important to note that the therapists were not asked to describe men
and women as they are but as the experts thought these men and women ought
to be. "The results (with no significant difference from male and female
therapists) were quite simple. A healthy adult is a healthy male."1 The
stereotype is very pervasive in societal thinking, especially when considering
the qualities and capabilities possessed by a person of a specific sex to
perform a non-traditional job.

It is important to examine further how sex differences may affect individual
capabilities, group performance and image in relation to military
effectiveness. Complex factors such as discipline, leadership, training,
societal influence and group relationships all bear on efficiency.

It is clear that personnel quality (measured by educational level and
general intelligence and aptitude) of women in the armed forces improves
the overall level. Individual differences far outweigh sex differences.
That is, the differences within males and within females on ability
variables are far greater than the differences between the means of the
sexes. A second finding is that the literature contains many contradictions
relative to sex differences. Results are often confounded by unconscious
bias, the actual task used, and overgeneralization from one sample to the
population. Although findings tend to be inconsistent (see, for example,
Bond & Vinacke, 1961; Maier, 1970; Megargee, 1969), it has been reported
that males and females are similar in leadership ability (Day & Stogdill,
1972), problem solving (Mathews, 1972), cooperation and competition (Lirtzman
& Wahba, 1972), and potential capability (Bass et al., 1971). However,
women are not always accepted. This may be due in part to the overemphasis
of group differences, and not to evaluation based on individual
qualifications. The resulting resistance often has been manifested as
lower pay (U.S. Department of Labor, 1971), lower positions in the
organization (Fidell, 1971), and a general under-utilization of women in
the work force (Kootz, 1970).

1 A Male Guide to Women's Liberation, Gene Marine, Avon Books, New York, 1972

Research suggests that as long as women are in the minority, men fulfil
their own need to project the male image. This would tend to isolate
women, keep the male group in conflict with them, and thus reduce overall
group productivity. Integration studies at Yale and Princeton
Universities found generally that, while the ideal mix was, not surprisingly,
half and half, social problems were less likely to develop when
the ratio of men to women was lower than three to one. Above that
threshold, according to the researchers, some women tended to assume a
"superwoman" role and to make more male friends than they normally would,
while the men tended to socially reject them as inferior.1 Interpretation
of the determinants of behavior in combat suggests that the introduction
of women into fighting organizations or seagoing units would have less
disruptive effects on solidarity.


The image abroad would probably attract little attention anywhere, unless
it resulted in a dramatic shift in total sex composition or unless it was
accompanied by an unprecedented integration of women into U.S. fighting
units. As the discussion above indicates, a healthy measure of uncertainty
remains about how greater female participation would affect the factors in
sex differences noted.

COST EFFECTIVENESS: Through Increasing Women in the Military

While social forces must be considered in the development of national
policies with respect to women in the military, so too must dollars-and-cents
cost-effectiveness. If an increase in the proportion of women
in the military could lead to lower costs without sacrificing effectiveness,
the United States is paying more than is necessary to field its present
military forces. The financial implications associated with a change in
the mix of men and women in the armed forces show not only that the cost
differential previously associated with a higher expected personnel turnover
rate has been largely eliminated, but also that if one-time costs
for separate facilities are incurred, they would be offset, at least in
the short run, by annual savings that would result from supporting a
smaller dependent population. Furthermore, the larger costs that might
accrue because women would be less productive on the job than men for reasons
of pregnancy or illness are now more than offset by the greater tendency
of men to have disciplinary, drug, and alcohol problems.

1 James H. Thomas and Dirk C. Prather, "Integration of Females
into a Previously All-Male Institution," Proceedings of the Fifth
Symposium on Psychology in the Air Force (United States Air Force Academy,
Department of Behavioral Sciences and Leadership, April 1976), pp. 100-01.

All in all, the force of admittedly scanty evidence is that changes in the
sex composition of military services would lead to changes in total costs.
Over the long term, however, such differences in costs would probably narrow
as more women enter the military services. A further consideration, and one
that is likely to attract more attention, is the prospect of being able to
maintain desired quality standards among volunteers.

Generally  then  the  factors  in  cost  effectiveness  related  to  quality 
military  performance  are  both  social  and  military  and  require  further 
examination  and  careful  documentation  of  all  costs  over  a  specific  period 
of  time. 

The  importance  and  value  of  full  recognition  and  equality  for  women  in 
the  military  is  being  tested  at  all  levels,  in  both  traditional  and  non- 
traditional  roles.  For  the  most  part,  research  has  proceeded  by  focusing  on 
the  conceptual  analysis  of  a  single  variable.  Findings  from  both  the 
military  and  private  sector  research  on  career  success  have  established 
that  manipulating  one  variable  or  one  procedure  does  not  provide  career 
success in and of itself. In fact, what is typically required is a program
that  incorporates  many  components  and  procedures  of  career  success  in  a 
systems  approach. 

Recent studies1 identifying differences between female and male executives
point to the importance of recognizing (1) the masculine perspective of
supervisory and management roles and (2) the nature of both formal and
informal relationships among corporate executives. This research is
applicable to the military organizations which were built by men and for
men, and are now controlled by men. The forms, rules and styles of behavior
and communication among corporate supervisors, managers and executives
grew from a distinctly male culture. An understanding of these variables is
essential to career success. Recognition of the male culture also defines
male involvement in any treatment designed to improve the work climate.


1 The Managerial Woman, Margaret Hennig and Anne Jardim, Anchor Press/
Doubleday, N.Y., 1977

It is acknowledged that there are complex problems existing relative to the
full utilization of women in the armed services. Some of these problems will
take further effort to resolve. However, we believe that even within the
existing constraints greatly improved utilization of women in the armed
services is possible.


THE PROBLEM: Fuller Utilization of Women vs. Military Effectiveness

The multifaceted problem is difficult to continue exploring "piecemeal" given
the overall implications for military effectiveness. The armed services
appear reluctant to address the issue at this time.1 Therefore a comprehensive
experimental approach is proposed in which the military system may be
studied and evaluated given applied training treatment groups and appropriate
control groups, which will identify contributions of each aspect of the system
to the results at at least three intervals: pretraining, post training and
pre-combat, and post combat.


In summary the problem is extremely complex, involving a cross-cut of
social and military factors. Two powerful social forces are in collision:
the push for women's equal rights is in conflict with deeply rooted traditions
that question the propriety of women under arms. That the body politic
supports equal opportunity in principle is indisputable; however, the
extent to which people will accept equality in practice, including
committing women to combat, is less clear. Further, the budgetary advantages
of recruiting more women are at variance with the perceived risks
to the U.S. national interest. The problem then is implementing principle
without decreasing military effectiveness, and increasing the utilization
of women personnel resources and increasing overall military effectiveness
and maintaining cost effectiveness.


1 Ibid, p. no

THE RESEARCH IS FOCUSED ON THIS QUESTION:

Does applied training (as defined) affect the optimum integration of
women into the armed forces and show cost benefits for a military
effective climate?

THE PROPOSED RESEARCH APPROACH: OVERVIEW


The approach is a basic experimental-control panel survey design. It
is tailored to studying the effect of applied training on military
effectiveness when there is fuller utilization of women in the military
organization. The design of the study is demonstrated in Figure 1. It is
intended to cover a period of not less than 18 months. The projected
study would consider the effect of the specialized training on twenty
four groups (twelve enlisted and twelve officer groups) of 30 military
personnel each drawn from integrated select units, where at least
half of the units (treatment and control) would be going on to combat
experience.1 The research would study the effects of the specialized
training in the all male group, the all female group and the two mixed groups
of both male and female personnel as well as the effects on the unit from
which personnel were selected at each site (all group members would be
selected from one unit). Criteria for selection, classification and
assignment of personnel, both men and women, would be important as would
be the measures of effectiveness to be used. At each experimental
site an equivalent control group would be identified and tested. The
testing would consist of:


(1) An application selection test;

(2) Pretesting for baseline data;

(3) Interim testing during applied training at the end of each
phase (3 times);

(4) Post testing for growth as a result of applied training; and

(5) Follow-up testing at six months and twelve months.

1 Officer group size may be adapted in order not to cause impossible
organizational configuration at the time. See p. 73, Women and the
Military.

The follow-up testing would be to assess the internalization of new
behaviors and attitudes in military effectiveness including results in
combat experience, special training on performance and attitudes. It is
recommended that this design be implemented at three sites where there
could be experimentally arranged units of 20%, 30% and 40% integration of
women into the military organization. The design would use 1440 selects,
720 of whom would be control subjects. Figure 2 shows distribution
at one site. This would be duplicated at the other sites for 30% and 40%
integration.
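
The subject counts quoted above can be checked with a short calculation. The following is only a sketch: the split of four enlisted and four officer treatment groups per site is an inference from the "twenty four groups" spread over three sites, and the names used (GROUP_SIZE, design_counts, and so on) are illustrative rather than part of the proposal.

```python
# Sample-size arithmetic for the proposed three-site design.
# Assumption (inferred from the text, not given as a table): each site has
# 4 enlisted and 4 officer treatment groups, each matched by a control group.
GROUP_SIZE = 30
INTEGRATION_BY_SITE = {1: 0.20, 2: 0.30, 3: 0.40}  # share of women in the units
ENLISTED_GROUPS_PER_SITE = 4
OFFICER_GROUPS_PER_SITE = 4


def design_counts():
    """Return group and subject counts for the whole study."""
    n_sites = len(INTEGRATION_BY_SITE)
    enlisted_groups = ENLISTED_GROUPS_PER_SITE * n_sites
    officer_groups = OFFICER_GROUPS_PER_SITE * n_sites
    treatment_groups = enlisted_groups + officer_groups
    treatment_subjects = treatment_groups * GROUP_SIZE
    # One equivalent control group is assumed for every treatment group.
    control_subjects = treatment_subjects
    return {
        "enlisted_groups": enlisted_groups,
        "officer_groups": officer_groups,
        "treatment_groups": treatment_groups,
        "treatment_subjects": treatment_subjects,
        "control_subjects": control_subjects,
        "total_subjects": treatment_subjects + control_subjects,
    }


if __name__ == "__main__":
    for name, value in design_counts().items():
        print(f"{name}: {value}")
```

Under these assumptions the counts agree with the figures in the text: 24 treatment groups, 720 control subjects and 1440 subjects in all.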


[Figure 2 not reproduced; column headings: TREATMENT, CONTROL]

Figure 2: SAMPLE: Applied Training/Control Groups at Site 1 (20%
integration of women).1

1 Site 2 (30% of women) and Site 3 (40% of women) integrated into units.

TREATMENT INTERVENTION: Applied Training, Optimization and Adaptive
Instruction


A brief description of the applied training approach is in order to
separate it from general teaching and training models in use.

The applied training approach treats the teaching-learning process
phenomenon as a whole, recognizing the total environment and specifically
the interrelationships between the content (competence), cognition and
individual differences. It is assumed that what an individual subject does
and can learn depends inextricably on what is already known. Thus in the
learning setting, adaptive instruction defines the underlying competence
of the learner and builds from that point, realizing what he/she needs to
know in the content domain; identifying and developing the instructional
structures; the stimulus experiences (exercises, simulations, role plays)
and the steps for acquiring the specific behavioral competencies (skills).

The applied training approach is an active, intense and highly interactive
teaching-learning process providing the opportunities throughout for
integrating the cognitive, behavioral and affective domains in each learning
setting, consistently reducing the number of internal blocks to learning
and change and increasing productive performance potential that is cost
effective. The process learning is EAGT: Experience is shared structured
experience in the stimulus setting; Articulation of the experience,
verbalizing behavior, affect and cognitions; Generalizing major learnings
from the experience; and Transferring through identification, potential use
in a novel situation. The model moves the learner in his/her own reality,
moving from what they know, identifying the new structures for learning
and then returning to a new reality, where the learner can adopt the new
learnings. The applied training is designed in four segments to be held
over a three month span providing opportunity for transfer to the real
world and accountability for learnings upon return to the next training
session.

The design of the applied training capitalizes on closing the "practice
to use" gap, implementing the new learnings soon after the experience
("try out"). To get a fair try out, participants plan before they leave
the session how they will implement the new learnings in their day to day
activity and when they return to the next session, the session begins by
reviewing implementation and dealing with the problems before we begin
new subject matter. Actually providing opportunities for try out of new
behaviors is the beginning of the relevant attitude change.

The scope of this applied training design addresses these objectives:

Primary Objective

Improving the attitude toward utilization of women personnel (enlisted
and officers) in a military effective climate and specifically improving
and increasing the utilization of female personnel demonstrating cost
effectiveness.

Secondary Objectives

• Training women in appropriate military organizational behaviors that
promote successful career patterns.

• Developing the self management know-how and leadership skills of
female military personnel at pre-supervisory, supervisory, and
management levels.

• Improving military personnel's attitude toward successful job
performance of women in new roles and non-traditional career ladders,
as well as in combat.

• Improving both officers' and enlisted personnel's understanding of
male/female socialization issues in relation to military job roles that
impact on the military organizational structure and military
effectiveness.

For the purpose of structural instruction and evaluation, performance
objectives are specified for each secondary objective and are specifically
addressed in one or more of the four modules of the applied training. The
treatment intervention, the applied training program/process, is the global
experimental variable, independent in nature, and specified in the context of
the training and evaluation. The dependent variables are identified in each
secondary objective (skills, pattern, policies and practices, attitudes,
and effectiveness) and are measures of success as a result of the training.
Figures 4 and 5 describe variables and end results in relation to military
effectiveness focusing the major intervening variables.1


1 See Figures 4 and 5 for description of variables.

A description of the format for the applied training curriculum follows:
the applied training involves about 30 participants per session, with at
least two professional staff. A training input consists of eight two
and one-half day periods scheduled over a three month time span. Two
periods are described as a phase. A topical outline of the four phases is
attached in Appendix A. The four phases are:

I.  Focus  on  Management  of  Self 

II.  Focus  on  the  Organization:  Management  Skills 

III.  Focus  on  the  Organization  -  Supervisory  Skills 

IV.  Management  in  the  Military  System 

The sessions would provide the input of information, theory and substantive
content. Additionally, the sessions include experiential exercises and
role play simulations to try out new alternative approaches, plus opportunities
to practice new behaviors in preparation for application in the real
world on-the-job.

The process approach is carefully designed to meet the needs of the participants
and training sessions are adapted to participants' requirements and pace.
Presentations will be supplemented with written manuals that will develop
the participants' knowledge and awareness of the dynamics of functional
management in the military organizational structure. This will be related to
successful performance and career productivity in developing appropriate
forms, rules and styles of appropriate behavior and communication between
women and men in the armed forces.


Specialized emphasis is designed to specifically respond to needs of each
special training group. For example, as enlisted personnel, groups 1j -
4j would receive emphasis on pre-supervisory and supervisory input. In
contrast, assuming officer groups 5j - 8j have already been exposed to
supervisory patterns, emphasis would be placed on management input. If
any officer groups lacked exposure to supervisory patterns, then this would
be provided in the special training sessions. In the mixed groups (M*),
the initial sessions would be held in separate sex groups, developing
understanding of self and others prior to point of developing specific process
learnings together. Both male and female participants would then be combined
in sessions II, III, and IV.

The development needs of the trainees are addressed in a common core of
content in the general sessions. Each session is designed for full
participation of the trainees. Three subsequent aspects address the
developmental needs of each person separately. (1) At the end of each
training segment a brief period is provided for the individual trainee to
identify and associate the specific meaning for her/his development and plan
for try out in the interim. (2) Additionally, there are scheduled individual
"briefing" periods for each trainee to discuss her/his own developmental
needs and issues and to receive specific guidance and counseling from
the staff. (3) Opportunities for career planning are provided outside
of the training assignment to those who wish to focus long range plans.

In summary, it would appear that analytic approaches to optimization in
adaptive instruction are possible, and that the treatment intervention broadens
the choice in method objectives considering the many faceted aspects of the
problem that the research addresses. Although the technical problems in
implementing the research design at three sites are formidable, they do
not appear intractable.


RESEARCH ACTIVITY:


Evaluation Effectiveness of Applied Training


The research activity proposed for this project is the use of a
computer-managed value rating (CMVR) model for evaluating the effectiveness
of an applied training program. The model derives from a system developed by
Collet (1972) to compare outcomes of simulated evaluation projects with
the addition of value rating procedures adapted from Edwards (1977). Before
presenting the CMVR evaluation model in its entirety, it is appropriate to
describe the measurement technique which is its salient characteristic.


It is the aim of the CMVR model to express all evaluation results (outcome
variables) in a simple metric scale (labeled E), which is intuitively
meaningful to laymen, yet amenable to statistical manipulation. In
addition, it is expected that the proposed E scales will provide both a
means of estimating the cost of total achievement of program objectives,
and a direct comparison of the cost-effectiveness of programs having
fundamentally different objectives.

To illustrate the basic concepts of the model, assume that
there are pre and post scores on several standardized tests for each
person. Computer Managed Value Rating refers to a procedure in which
a computer interacts with a relevant set of experts to develop, for each
dependent variable (standardized test), a formula for transforming each raw
score into an effectiveness index or E score, where E is a number ranging
from 0.00 to 1.00 which represents the degree to which the objective as
measured by that variable has been achieved. The basic function of the CMVR
computer program is first, to help each expert validly represent his/her
values in metric form, and second, to achieve an optimal degree of
consensus among experts. The computer output consists of a set of
transformation rules. The specific strategies for developing these
transformation rules and suggestions for assessing measurement error are
discussed under the heading of research strategies.
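
As an illustration of what one transformation rule might look like, the sketch below maps a raw test score onto the 0.00 to 1.00 E scale. The linear form is an assumption made here for clarity; in the CMVR procedure the rule is elicited from the experts and need not be linear. The function name to_e_score and its two criterion-level parameters are hypothetical.

```python
def to_e_score(raw, non_success_level, success_level):
    """Map a raw score to an effectiveness index E in [0.0, 1.0].

    Assumed rule (illustrative only): E rises linearly from 0.0 at the
    non-success criterion level to 1.0 at the success criterion level and
    is clipped outside that range. The same formula also handles tests on
    which lower raw scores are better (success_level < non_success_level).
    """
    if success_level == non_success_level:
        raise ValueError("criterion levels must differ")
    fraction = (raw - non_success_level) / (success_level - non_success_level)
    return min(1.0, max(0.0, fraction))


if __name__ == "__main__":
    # Hypothetical test with non-success at 40 points and success at 80.
    for raw in (30, 50, 80, 95):
        print(raw, to_e_score(raw, non_success_level=40, success_level=80))
```

A raw score halfway between the two criterion levels yields E = 0.50 under this assumed rule; any expert-derived curve that stays within the same bounds could be substituted.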

1 The description assumes that objectives are stated in terms of absolute
achievement, that is, referenced to specific criterion levels for success
and non-success. (The procedure can be adapted to studies of objectives
stated in relative terms.)

COMBINING DEPENDENT VARIABLES: After the transformation operations, both
the pre and post-treatment performance of each person is represented by a
set of E scores. To get a single estimate of a person's overall performance,
it is necessary to determine the relative importance of each dependent
variable in assessing the achievement of the program objective. The CMVR
computer program can use essentially the same strategy as for the development
of E scores to interact with the experts to produce a set of normalized
weights (i.e., the weights sum to 1.00) which possess this property. The
sum of the cross-products of a person's E scores and the associated weights
would then produce a composite E score for both pre and post tests. These
composite E's would be numbers ranging from 0.00 to 1.00 representing the
degree of achievement of the objective as measured by the ENTIRE SET of
dependent variables.
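
The weighted sum described above can be sketched as follows. The helper names (normalize_weights, composite_e) and the variable labels are illustrative assumptions; the expert elicitation of importance ratings is outside the sketch, which starts from ratings already in hand.

```python
import math


def normalize_weights(importance):
    """Scale non-negative importance ratings so that they sum to 1.00."""
    if any(value < 0 for value in importance.values()):
        raise ValueError("importance ratings must be non-negative")
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("at least one rating must be positive")
    return {name: value / total for name, value in importance.items()}


def composite_e(e_scores, weights):
    """Sum of cross-products of one person's E scores and the weights."""
    if set(e_scores) != set(weights):
        raise ValueError("E scores and weights must cover the same variables")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.00")
    return sum(weights[name] * e_scores[name] for name in e_scores)


if __name__ == "__main__":
    # Hypothetical ratings: the second variable is three times as important.
    weights = normalize_weights({"test_a": 1, "test_b": 3})
    print(composite_e({"test_a": 0.5, "test_b": 1.0}, weights))
```

Because the weights sum to 1.00 and each E score lies between 0.00 and 1.00, the composite necessarily falls in the same range.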

MULTIPLE OBJECTIVES: In the above illustration a single program objective
was assumed for the sake of simplicity. However, the procedure is easily
adaptable to the assessment of programs having multiple objectives by using
the above procedure to develop a set of weights representing the relative
importance of each objective in achieving a successful program. Strategies
for dealing with various levels of objectives (e.g., terminal vs. enabling
objectives) are suggested in a subsequent section.
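
One way to write the two-level weighting is given below. The notation is introduced here for illustration and does not appear in the original: w_{kj} is the weight of dependent variable j within objective k, v_k is the weight of objective k within the program, and E_{ikj} is person i's E score on that variable.

```latex
\begin{align}
  E_{ik} &= \sum_{j=1}^{J_k} w_{kj}\, E_{ikj},
         & \sum_{j=1}^{J_k} w_{kj} &= 1, \\
  E_{i}^{\text{program}} &= \sum_{k=1}^{K} v_k\, E_{ik},
         & \sum_{k=1}^{K} v_k &= 1 .
\end{align}
```

Since both sets of weights are normalized and every E_{ikj} lies between 0.00 and 1.00, the program-level composite stays on the same 0.00 to 1.00 scale.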


PROGRAM EFFECTIVENESS: One of the convenient attributes of the E metric
is that the mean E score (both for the individual dependent variables and
for the composite) represents the degree to which the entire group achieved the
objectives. The difference between pre and post composite E means (mean E
gain) would be a measure of the degree of movement towards the objectives
during the administration of the program.
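
The mean E gain can be computed directly from the pre and post composite scores. This is a minimal sketch; the function name mean_e_gain is illustrative and the scores in the usage example are made up.

```python
from statistics import fmean


def mean_e_gain(pre_composites, post_composites):
    """Post-test mean composite E minus pre-test mean composite E."""
    if not pre_composites or len(pre_composites) != len(post_composites):
        raise ValueError("need matching, non-empty pre and post score lists")
    return fmean(post_composites) - fmean(pre_composites)


if __name__ == "__main__":
    # Hypothetical composite E scores for a two-person group.
    print(mean_e_gain([0.25, 0.5], [0.75, 0.75]))
```

A positive value indicates movement of the group toward the objectives; a value near zero indicates little change over the program.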

ESTIMATING PROGRAM COST: The mean E gain should be useful in developing
generalizable estimates of the program cost. The actual cost of the
program divided by the total E gain would provide an estimate of the unit
cost of moving one person from complete non-achievement to complete
achievement of objectives, assuming uniform growth. This UNIT COST ought
to be useful in predicting the cost of moving a new group from their
initial performance level to the desired criterion.
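
The unit cost calculation and its use for prediction can be sketched as below. The function names, the dollar figure and the scores are hypothetical, and the prediction step assumes, as the text does, that growth is uniform so that cost scales with the E shortfall to be made up.

```python
def unit_cost(program_cost, pre_composites, post_composites):
    """Program cost divided by the total E gain summed over all persons."""
    if len(pre_composites) != len(post_composites):
        raise ValueError("pre and post lists must have the same length")
    total_gain = sum(post - pre
                     for pre, post in zip(pre_composites, post_composites))
    if total_gain <= 0:
        raise ValueError("total E gain must be positive to define a unit cost")
    return program_cost / total_gain


def predicted_cost(cost_per_unit, initial_composites, criterion=1.0):
    """Estimated cost of raising each person in a new group to the criterion."""
    shortfall = sum(max(0.0, criterion - e) for e in initial_composites)
    return cost_per_unit * shortfall


if __name__ == "__main__":
    # Hypothetical program: 10,000 dollars spent, total E gain of 2.5.
    cost = unit_cost(10_000, [0.25, 0.25, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0])
    print(cost)
    print(predicted_cost(cost, [0.5, 0.75]))
```

Persons already at or above the criterion contribute no shortfall and therefore no predicted cost in this sketch.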

ASPECTS OF THE EVALUATION TECHNOLOGY 1


• Program analysis

• Taxonomy of evidence

• Concept of lexicographic processing

• Concept of theoretical evidence

• Measurement techniques:

- E transformation

- importance weights

- error assessment in E indices

• Simplification through computer management

1 Detailed explanation and derivation of each is available by
writing Marilyn E. Harris, Ph.D., 80G Metropolitan Bldg., Flint, MI 48502

An important first step in evaluation is to analyze the relationship between
instructional activities and intended outcome. The power of this procedure
is that it highlights potential weak links. Information about the efficacy
of each instructional transaction is important for diagnostic purposes in
relation to the overall goal.

An example, Figure 3, defines the task: to evaluate a program to train
military personnel in effective knowledge and skills with respect to its
terminal objective of improving military effectiveness. The inferential
chain required to link this instructional activity to the desired outcome
behavior is illustrated in Figure 3. Each of the numbered boxes represents
a potential source of information. If an evaluation project's only interest
is in determining whether or not the terminal objective was achieved,
information about personnel achievement would provide sufficient evidence for
the decision. However, if the purpose is to provide either formative
evaluation information or information about the degree to which the terminal
objective had been achieved, observations should be made at each of the
points between 1 and 17. This would be particularly true in a long-term
project such as this example, since one would expect the program effects to
filter slowly further and further down the inferential chain with the
passage of time.
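
As an illustration only, the idea of locating weak links along the inferential chain can be sketched in Python. The link labels, the evidence scores and the threshold below are hypothetical; they are not taken from Figure 3, and the scores are not the E indices named earlier. The sketch assumes only that each observation point yields some comparable score.

```python
from dataclasses import dataclass


@dataclass
class ChainLink:
    point: int       # numbered box in the inferential chain (1..17 in the example)
    label: str       # hypothetical description of what is observed at this point
    evidence: float  # hypothetical evidence score in [0, 1]; higher is stronger


def weakest_link(chain):
    """Return the link with the lowest evidence score (earliest point wins ties)."""
    return min(chain, key=lambda link: (link.evidence, link.point))


def first_break(chain, threshold):
    """Return the first link, in chain order, whose evidence is below threshold."""
    for link in sorted(chain, key=lambda item: item.point):
        if link.evidence < threshold:
            return link
    return None


# Hypothetical four-point chain; a real chain would carry all 17 points.
chain = [
    ChainLink(1, "training delivered as planned", 0.9),
    ChainLink(2, "knowledge acquired", 0.8),
    ChainLink(3, "skills applied on the job", 0.4),
    ChainLink(4, "unit performance improved", 0.6),
]
```

Observing every point, rather than only the terminal one, is what allows `first_break` to say where along the chain the program effect stops being supported.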

PROJECTED  DELIVERABLES 

The following list is intended only to suggest the character and
extent of the potential products:


•  Comprehensive  description  of  the  applied  training  program 
with manual

• Evidence of the degree to which the program was actually
implemented

•  Evidence  of  the  degree  to  which  the  program  achieved  its 
objectives 

• A description of the CMVR evaluation model implementation

• Evidence of the utility of the CMVR model

• Guidelines for implementing the training program and CMVR
techniques at other installations

• Documentation of the CMVR evaluation techniques. This would
include practical examples and specific user instructions

• Computer programs and other technical products associated with
the evaluation technique. All programs would be transportable
to computer installations with FORTRAN IV capability.
(Memory requirements are anticipated to be minimal.)

The expected end-results are both tangible and intangible. In private
sector applications, intangible results are documented in reduced stress
and tension on the job, attractiveness of non-monetary factors, positive
attitudes and increased trust and team spirit. Tangible results are
evidenced, for example, by the acquisition of increased leadership skills
in decision making and problem solving, improved communication skills
(interpersonal and organizational) and handling of conflict in on-the-job
settings. Further, participants acquire basic management skills in planning,
organizing, staffing, directing and controlling. These improvements should
produce more effective performance of the unit to which the women trainees are
assigned, including less lost time and better use of total personnel resources.

The  tested  benefits  to  each  trainee  are: 

* Identified level of competence in specific areas

* Improved and more confident self-image

* Identified goal directed behaviors

* Identified "how to" process skills immediately applicable
both on the job and off the job

* Measured performance effectiveness

* Acquired technical skills applicable to both traditional and
non-traditional job processes

* Career counseling and enrichment planning

* Organizational understanding and perspective

In the private sector, these improvements have typically resulted in
annual budget savings of 5% (range 5% - 20%) for the organization.

Jnt,dMted  *  "'"fry  application  of  lhLe 

In a standard private sector application, a "training of trainers"

iTXr rs  to  * 




Additionally, at the conclusion of the applied training program it will
be possible to identify:

• The consequences of integrating significant numbers of
women into career positions (e.g., positions having career
ladders) and identify career ladder implications in training
adoption

• Recommended percentage of integration to give best cost effectiveness
results

• The percentage of personnel (male and female) benefited by
the applied training, and specific benefits through objectives
successfully met

• The degree of benefit to the personnel

• Attractiveness of the job/career to the individual

• The benefits to the organization

• The cost savings in human resource utilization: qualified
performance/unit cost (dollar figures)

• Attitude change from confronted sex differences

• The durability of the training over time

• Identification of key factors in military effectiveness

• The social acceptability of the program in the military

• More attractiveness of non-monetary aspects of military life.


CONCLUSION 

Considerable study and review of the several social and military factors
which affect military effectiveness reveals that the armed forces have
problems similar to those that exist in the private sector as related to the
integration of women in the work world. The areas where the unknowns are,
are also very similar, even including the combat issue, when effectiveness
is focussed as the measurable dynamic. In the private sector, large firms
are experimenting with techniques similar to those in the applied training
approach throughout whole systems, and finding that change is possible.

Empirical conclusions recognize that the change must be a planned change,
intentional and of one's own volition, in contrast to amwint or
revolutionary change. The planned change must be heavily oriented toward
education of all personnel: education related to the social psychological
aspects of human interaction, both male and female, interacting in the
work setting. Understanding of the socialization process and its
consequences provides a real basis for understanding sex differences and
for valuing individual differences over sex differences, which leads to
selection of personnel based on the individual's own ability. Effectiveness
clearly will be affected as we gain skill in identifying the components
of a task, how to train for it and how to assess successful performance. Then,
as competence criteria are met in a population, effectiveness may be attained.




Cost effectiveness is likely to result in the integration of women into
the military as well as in the private sector as long as women clearly
remain in the minority and are trying to demonstrate equality. The amount
of savings will be dependent on the quality of the volunteers recruited.


In order to construct usable theory in the area of female integration
in the armed forces we believe the applied training approach is worth
testing in a comprehensive research project as described here. It is important
for the military institution to take the initiative in providing the educational
intervention model to confront change of the salient dependent variables
in a comprehensive research-study approach. And while the applied training
and its evaluation is still far from a finished product, it offers one
way of resolving a number of fundamental problems in military effectiveness.

APPENDIX A - TOPICAL OUTLINE OF PHASES
PHASE  I:  FOCUS  ON  MANAGEMENT  OF  SELF 

Opportunities to understand, discuss and experience the dynamics
of the search for identity in a society of many changing
roles.

* Personal issues: male/female needs for approval and success,
feelings of ambivalence, conflict, avoidance, sexuality

* Stereotypes and sex roles: myths, facts, reality,
socialization and change

*  Motivation  and  achievement:  socialization,  drives, 
needs 

*  Legal  boundaries  and  constraints:  laws,  processes  and 
procedures,  support  and  action 

*  Values  clarification:  value  basis,  development  and 
ethics 

*  Decision  making  and  action:  data  collection,  analysis, 
effects  and  consequences,  evaluation 

* Interpersonal communications: cycle, openness and constructive
feedback, language

Integrating an image: developing an assertive, affirmative you.

PHASE  II:  FOCUS  ON  THE  ORGANIZATION:  LEADERSHIP  ATTITUDES 

Opportunities to learn, sharpen skills and try out new methods
for productive performance in job and work settings.

* Organizational communications skills: use of information,
systematic flow, written vs. oral

* Decision making and problem solving: diagnosis

* Leadership: functional, resource utilization, variability
in style, task and maintenance roles

* Goal setting and action planning: specification, focus,
diagnosis, direction, designing, responsibility


* Intervention and use of influence: alternatives, tasks,
types, practice

* Conflict management: awareness, ownership, expression and
articulation, alternatives in dealing

* Accountability and responsibility taking: specifying task
and time, checking

* Management process: planning, organizing, staffing,
direction, controlling

Working in actual situations to develop skills and useful job attitudes
for  use  outside  the  workshop  setting. 

PHASE  III:  FOCUS  ON  THE  ORGANIZATION:  SUPERVISORY  SKILLS 

Opportunities to develop specific skills applicable on the job,
both traditional and non-traditional.

* Teambuilding: resource identification, relevant goals,
integration of affect

* Delegation: identifying specifics, acting, growing
responsibility and authority

* Time Management: planning, implementation, discipline

* Performance Evaluation: values, improvement, assessment

* Training others: psychology, theory, practice, experiences

* Management Development: goal setting, long term development

Integrating skills in on-the-job situations considering planned
change. 

PHASE  IV:  MANAGEMENT  IN  THE  MILITARY  SYSTEM 

Opportunities to plan specifically for your future needs and the
needs of others. Learn how to use new knowledge and skills in
the military.

* Planning for the future: long range planning, phasing,
development and implementation

* Support groups: components, need, criteria and use

*  IhTllml  process  Cha"9'"9  r0,e$;  re5fstJ"«-  "PPOrt. 

*  -  programing, 

^•rtsrgsji:  andt*$t  *• 
Presentation

given by Rear Admiral Q. Fiebig / Colonel H. Seuberlich
on the occasion of the 19th Conference of the Military
Testing Association at San Antonio, Texas, 1977


I. Introduction

(Slide 1)

The "Funktionsanalyse Personalstruktur"+) which will be dealt with
here today constitutes a part of the fundamental work aiming at a
new manpower structure of the armed forces of the Federal Republic
of Germany. This fundamental work is based on the results of the
investigations carried out by the Manpower Structure Commission of
the Federal Ministry of Defense in 1971; the work of that Commission
was already expounded before the MTA on the occasion of the annual
conferences of 1972 and 1973. Nevertheless I would like to outline,
as an introduction, a few fundamental deliberations of the Manpower
Structure Commission as well as the essential elements of their
proposals, in order to delineate once more the interrelations and
dependencies of that work with regard to the instruments of the
"Funktionsanalyse Personalstruktur".

When the causes of the continuously increasing difficulties in our
armed forces with regard to personnel were investigated, it was
realised at that time that the problems with which the Federal Armed
Forces are faced in the personnel sector cannot be solved in the short
term. Only a thoroughly new manpower structure of the armed forces,
which corresponds to the industrial society with its principle
of division of labour, would be able to bring about a decisive
improvement. The main problem seemed to be the following:

+) Analysis of the functions pertaining to a job

In order to be able to maintain their effect of deterrence, the
armed forces continually need new and more and more efficient
weapon systems which, however, require a steadily increasing
proficiency of the personnel.

(Slide 2)

For this reason the systematic build-up of a modern manpower
structure is indispensable. That structure must:

1. correspond to the aim of the organisation;

2. take into account the capabilities, the age, the personal
characteristics and the professional expectations of the
individual soldier, and

3. be adapted as far as possible to the external conditions,
such as politico-economic developments and the labor market.

The basis of the further deliberations of the Commission was that
the entire service law has its original point of reference in the
individual functions. Under this aspect, an essential element of
the concept was the proposal systematically to record and arrange
in a new way the interrelations between function, responsibility,
pay and rank. Such a reorganisation could only be attained on the
basis of the description, analysis and assessment of all functions
in the armed forces.

(Slide 3)

This reorganisation shall inter alia serve for solving the
following problems:

- determination of a system-oriented personnel organisation

- indication of job characteristics

- development of correlated profiles of requirements which must
be fulfilled and of correlated profiles of qualification

- building up of assignment packages with employment-oriented
training, and

- weighting of functions in stages.

For that purpose it is necessary to consider the individual
functions with regard to the correlation between performance
and efficiency. That means that those jobs which are interrelated
with regard to the work to be performed and in which the
cooperation is directed towards a specific aim or result shall be
combined. In other words, the achievements obtained by the work
on several jobs may be necessary for attaining a specific aim.

(Slide  4) 

Such a correlation between performance and efficiency is
characteristic of units and sub-units the organization of which
is based on the principle of division of labor. This point is
particularly important because no foundations for that work in
the sense of the goals pursued by the Commission were available,
and therefore it became necessary to develop specific instruments.

II. Other Projects

Apart from the Manpower Structure Commission, other commissions
were also working at the same time in the domain of the Federal Minister
of Defense.

(Slide 5)

In the same year, 1971, reports were submitted by:

• the Force Structure Commission (its first report on equity in
induction)

• the Education Commission (the expert report on the reorganisation
of education and training).

It had a negative effect that the Federal Government ordered immediate
measures to be taken in the field of training and education, for
example by the foundation of universities of the Federal Armed Forces,
although a reorganisation of the armed forces ought to have been
initiated only after submission of the second report of the Force
Structure Commission containing proposals for a new force structure
at the end of 1972. For this reason it was not possible in each case
to coordinate the interdependencies in a logical manner appropriate
to the matter in question. This also impeded the fundamental work in
the field of manpower structure, sometimes considerably. But the overall
goals remained unchanged.

The work aiming at the development of instruments for the description,
analysis and assessment of functions in the armed forces began as
early as 1971. At first, certain procedures applied in the economy,
in public service and in friendly forces were investigated, but these
procedures did not correspond to the goals of the Manpower Structure
Commission. Thus, the Federal Armed Forces were to some extent faced
with new ground in the scientific field, because all functions
to be performed in compliance with the mission assigned to the
armed forces had to be recorded systematically and according to
uniform aspects. For that purpose, it was on the one hand
necessary to subdivide functions into specific elements, and on
the other hand to assign skills, knowledge and capabilities
required for a function (a job) to the appropriate function
elements. Thus a two-stage set of instruments, called
"Funktionsanalyse Personalstruktur", was created with the support of
scientists.


(Slide  6) 


In 197? this set of instruments called "Funktionsanalyse
Personalstruktur" was subjected to a first comprehensive testing.

5,400 soldiers of all ranks were interrogated according to the
principle of qualitative random selection; the soldiers concerned
were engaged in the following fields:

- infantry employment,

- aeronautical employment,

- nautical employment,

- personnel administration.


The evaluation as a whole had a satisfactory result. But the horizon
of expectations of the Federal Ministry of Defense was not yet
entirely satisfied.


(Slide 7)




The main deficiencies were:

- collection of too comprehensive data material,

- neglect of skills as compared with knowledge,

- lack of a scale for determining levels of objectives
of learning.

For this reason the set of instruments "Funktionsanalyse
Personalstruktur" was systematically revised from 197^ onward. Points
of main effort were the development of appropriate ADP program
structures as well as a repeatedly extended scientific support
of the project, which concerned specific items of the latter.

(Slide  8) 

You can see the results in this schematic representation of the

FUNKTIONSANALYSE PERSONALSTRUKTUR

The flow diagram is divided into seven phases. The three main
elements are

- the determination and registration of tasks,
in phases 1 to 3

- the analysis of requirements which must be fulfilled,
in phases 4 to 6

- the summarising evaluation,
in phase 7.

At present, all individual measures are based on this reliable
procedure. Thus it was possible to reduce the scientific research
work from 1976 onward. Since that time, the work of the
scientists has gradually assumed the character of project assistance,
as the work can be done to an ever increasing extent by the
Federal Armed Forces themselves in the future. I shall revert
to this matter when Colonel Seuberlich has presented the
instruments to you in detail.

III. The Inventory of Instruments

(Slide 8)

The multi-stage inventory of instruments called "Funktionsanalyse
Personalstruktur" serves for the uniform description and analysis
of all functions to be performed in the armed forces and for the
collection of evaluation criteria in 7 phases and 35 steps.

(Slide 9)

"Function" is to be understood as the totality of tasks to be
performed in a particular job. Thus the task is the main element
of description to which the requirements which must be fulfilled
are referred.

The prerequisite for comparability within the armed forces is a
uniform description of tasks. The individual description elements
and criteria are to be formulated as uniformly as possible and
as specifically as necessary for all functions in all services.

By "task" we understand the totality of purpose-oriented actions
which are in direct correlation with regard to performance and
efficiency. Within the meaning of the "Funktionsanalyse
Personalstruktur" the tasks and the relevant definitions of specific
forms of tasks are not only elements of a function, but also
constitute the basis for the distribution of responsibilities,
and thus for the classification of the function concerned into an
organisational structure.


We distinguish between five groups of tasks:


1. Tasks of tactical command and control

The aim of the task "tactical command and control" is to
"combine, to organise and to move means of combat in an
appropriate manner, taking into account the situation, and
to bring them into interaction in combat".

These tasks imply above all a directing responsibility and
may become effective with soldiers down to NCO.


2. Military specialist tasks

These are tasks listed in the "Catalogue of Functions of
the Federal Armed Forces" and subdivided into approximately
60 specialised groups.


3. General tasks of superiors

These are tasks resulting from the appointment in question
or from the capacity as a superior, pursuant to the directive
governing military superior-subordinate relations, but not
from a specific military function.


4. General tasks

These are tasks pertaining to another specialised group which
do not belong directly to the proper specialist tasks to be
performed in a specific specialist function.


5. Additional task/function

Task/function performed temporarily or permanently by soldiers
during the time they belong to the unit/agency in question,
beside their military specialist tasks, their general tasks as
superiors and their general tasks, depending on the mission
concerned and on the requirements in the unit/agency. Such a
task/function is not bound to the proper tasks pertaining to
a job.

With regard to the inventory of instruments, we also distinguish
specific forms of tasks, apart from these groups of
tasks. By specific forms of tasks we understand particular
parts of tasks. As a rule, they comprise several actions all
of which are characterised by the same degree of executing
responsibility or directing responsibility and are referred
to procedures or subjects having the same degree of difficulty.
Within the meaning of the "Funktionsanalyse Personalstruktur",
it shall be possible to perform these actions in a single job;
but this does not preclude that the holders of several jobs may
participate in the actual performance.

I have now explained a few terms which are very important
for understanding and using the inventory of instruments of
the "Funktionsanalyse Personalstruktur", because the aim,
a uniform description of tasks, can only be reached if the
essential terms are handled in a uniform manner.

I repeat that for the "Funktionsanalyse Personalstruktur"
the task is the main element of description and that the
requirements pertaining to the function in question are
referred to the task. The inventory of instruments is subdivided
correspondingly and consists of two main complexes:

- Part A of the inventory of instruments serves for the
determination and registration of tasks

- Part B serves for the analysis of requirements which
must be fulfilled.

The inventory of instruments is used within the scope of this
flow diagram consisting of seven phases. As you see, phases
1 to 3 serve for the determination and registration of tasks,
up to the job description; phases 4 to 6 comprise the analysis
of requirements which must be fulfilled, up to the assignment
of capabilities. Phase 7 is scheduled for the summarising evaluation,
up to the setting up of assignment packages, i.e. of those new
means for classification in the field of personnel organisation
which the Manpower Structure Commission has considered necessary
for the purpose of enabling a clearly arranged classification of
the new manpower structure of the armed forces.
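
The grouping of the seven phases can be written down as a small lookup, shown here as a Python sketch. The phase ranges follow the text; the assignment of Part A to phases 1 to 3 and of Part B to phases 4 to 6 is an inference from the descriptions above, and all names in the code are illustrative.

```python
# Grouping of the seven flow-diagram phases as described in the text.
# The "part" labels (A, B) are an inferred mapping, not stated as such
# in the source; phase 7 is not attributed to either part.
MAIN_ELEMENTS = [
    ("determination and registration of tasks", "A", range(1, 4)),
    ("analysis of requirements which must be fulfilled", "B", range(4, 7)),
    ("summarising evaluation", None, range(7, 8)),
]


def element_for_phase(phase):
    """Return (main element, inventory part or None) for a phase number 1..7."""
    for name, part, phases in MAIN_ELEMENTS:
        if phase in phases:
            return name, part
    raise ValueError(f"phase must be between 1 and 7, got {phase!r}")
```

For instance, `element_for_phase(5)` returns the requirements analysis together with the inferred part label "B".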

IV. The flow diagram

Now a few words on the individual phases of the flow diagram
(determination and registration of tasks).

(Slide  10) 

Phase  I 

The determination and registration of tasks begins with phase 1, the
current development of standardised instruments of inquiry. It
concerns lists of tasks, questionnaires on tasks or catalogues of
tasks. They are prepared in technical talks with specialist
instructors at schools and with competent soldiers experienced in
practice. Formulations of tasks, as main elements of description,
must fulfil specific requirements. These requirements are
summarised in six rules.

(Slide 11)

Rule 1: The task shall be formulated in such a way that the
function concerned is expressed in terms showing the
goal of the task; it shall also indicate exactly the
activity to be performed.


Rule 2: The wording of the tasks shall be unambiguous from a
linguistic point of view and formulated in a manner
intelligible to all.


Rule 3: A task shall constitute a working cycle/complex
clearly delineated with regard to time required and
specialty. Several job holders may participate in
performing a task.


Rule 4: The formulation of the tasks shall enable a clear
delineation.


Rule 5: The tasks shall not be worded in too narrow a way.


Rule 6: The tasks shall not be worded too comprehensively,
i.e. they shall not cover the tasks of entire
sub-units.


(Slides  12  and  13) 

Example (wrong/right)

The various forms of all specific tasks show differences with
regard to complexity, executing responsibility and directing
responsibility. Correspondingly, one of the five tendency values
laid down in ascending order is attributed to the forms of the
specific task concerned for each applicable characteristic of
the task. I would like to explain these three terms:

Complexity is a feature concerning the characterisation of contents
and the difficulty of tasks. It may be caused by a great multitude
and number of subjects and a great variety of equipment and
procedures at which or with which the task is performed. But it may
also be brought about by the composition and scope of the area of
work, conditioned on multiple, far-reaching flow of work as well as
on the number of persons participating and by their qualifications.

By executing responsibility we understand the responsibility for
purpose-oriented actions, implying the correct application of
procedures and methods. The level of the executing responsibility
within the scope of a task depends on the one hand on the degree
of independence and on the other hand on the significance and the
consequences of one's own actions.

By directing responsibility we understand the responsibility for
purpose-oriented, guiding and steering influence on the behavior
of other men as well as the responsibility for purpose-oriented
employment of material in compliance with the task. The level of
the directing responsibility within the scope of a task is determined
by extent and importance, intensity and consequences of the
measures caused with regard to the employment of personnel and material
for the purpose of performing the task.

The importance of the individual task is compared with the importance
of every other task pertaining to a job and assessed according to
its contribution to the fulfilment of the mission of the unit or
sub-unit concerned.
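
A record for one specific form of a task, carrying its task group and the three rated characteristics, might look like the following Python sketch. It is an assumption-laden illustration, not the ADP data file format of the Federal Armed Forces: the class and field names are invented here, and coding the five ascending tendency values as the integers 1 to 5 is assumed. A characteristic that does not apply to a task is left as `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskGroup(Enum):
    """The five groups of tasks distinguished in the text."""
    TACTICAL_COMMAND_AND_CONTROL = 1
    MILITARY_SPECIALIST = 2
    GENERAL_TASKS_OF_SUPERIORS = 3
    GENERAL = 4
    ADDITIONAL = 5


# Five tendency values in ascending order; the 1..5 coding is an assumption.
TENDENCY_VALUES = range(1, 6)


@dataclass(frozen=True)
class TaskForm:
    """One specific form of a task with its rated characteristics."""
    wording: str
    group: TaskGroup
    complexity: Optional[int] = None
    executing_responsibility: Optional[int] = None
    directing_responsibility: Optional[int] = None

    def __post_init__(self):
        # Each applicable characteristic must carry one of the five values.
        for name in ("complexity", "executing_responsibility",
                     "directing_responsibility"):
            value = getattr(self, name)
            if value is not None and value not in TENDENCY_VALUES:
                raise ValueError(f"{name} must be 1..5 or None, got {value!r}")


# Hypothetical example task; the wording and ratings are invented.
example = TaskForm(
    wording="Maintain the radio set",
    group=TaskGroup.MILITARY_SPECIALIST,
    complexity=3,
    executing_responsibility=2,
)
```

Keeping the ratings optional mirrors the phrase "for each applicable characteristic": a purely executing task simply has no directing responsibility value.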

(Slide 1(5)

I would summarise Phase I as follows:

First of all,

- the specialised activities to be investigated are selected
and

- selection plans are prepared on the basis of military
occupational specialties and of sub-units.


The indications made on the importance of the task as compared
with other tasks may differ widely

- "in normal duty"

- as well as "during exercises".

For the "Funktionsanalyse Personalstruktur", to perform a task
"in normal duty" means to perform it under peacetime conditions
or training conditions. For the purpose of differentiation, the
designation "during exercises" is used when a task is concerned
which is performed under field or simulated war conditions, such
as they exist in case of crisis and tension and in a defense
emergency.

It goes without saying that also

- "particular strain" caused by

  - time pressure,

  - strenuous attitude during work,

  - work requiring difficult movements,

  - work implying one-sided movements,

  - holding of heavy objects, etc.

- as well as the "particular working conditions" caused by

  - temperature,

  - heat radiation,

  - air flow,

  - permanent noise, etc.

will be registered.

As to the use of the various inquiry instruments, it is
extremely important to apply them according to the ADP requirements,
because otherwise neither the recording in the various necessary
data files nor the evaluation can be ensured.

In the course of the next step, competent soldiers are interrogated
according to a given scheme, as to the tasks to be performed by
soldiers in the functions and sub-units to be investigated.

During the third step, competent soldiers experienced in practice
comment on the pre-formulated tasks.

Thereupon the final specific forms of tasks are laid down on the
basis of task characteristics, complexity and responsibility.

These tasks are then recorded by ADP in the data file on tasks and
in the lists containing the specific forms of tasks, for the further
interrogation on AUTHORIZED tasks, as well as in the catalogues of
questionnaires on tasks intended for the interrogation on ACTUALLY
performed tasks.

Phase II

Phase  II  comprises  an  inquiry  on  ACTUALLY  performed  tasks  and 
a  preliminary  inquiry  on  AUTHORIZED  tasks*  Selection  plans  for 
representative  interrogations  are  prepared  for  the  inquiry  *n 
ACTUALLY  perforaed  tasks*  The  soldiers  are  interrogated  in  groups 
of  about  33  non.  The  procedure  of  answering  of  questions  concerning 
the  catalogues  of  questionnaires  on  tasks  is  siaple*  The  soldiers 
haw*  to  select  the  indications  pertinent  to  the  tasks  they  perform 
and  to  document  thee  with  the  aid  of  foras  to  be  narked  for 
documentation  by  ADP* 

The  inquiries  on  AUTHORIZED  tasks  are  carried  out  in  a  similar  way* 

In  this  context  it  is  important  to  elaborate  the  concepts  on 
AUTHORIZED  tasks,  with  regard  to  the  distribution  of  tasks  to  the 
various  jobs,  in  the  course  of  talks  with  competent  soldiers  according 
to  fixed  rules*  These  soldiers  are  asked  in  which  jobs  which  tasks 
-  and  which  specific  forms  of  taskn  -  are  to  ba  performed*  Besides 
they  must  indicate  which  importance  is  to  be  attributed  to  these 
specifio  fora*  of  tasks  in  normal  duty  and  during  exercises*  Such 
inquiries  on  AUTHORIZED  tasks  can  only  be  carried  out  by  means  of 
carefully  articulated  questionnaire*  which  are  summarised  in  the  list* 
of  specific  foras  of  tasks. 

(Slide 16)

Phase III


In this phase, the evaluation of tasks according to the determination
and registration of ACTUALLY performed as well as of AUTHORIZED tasks
is important. This is done by ADP. The comparison AUTHORIZED/ACTUAL
reveals the contrast between the results of the inquiries on
AUTHORIZED and on ACTUALLY performed tasks for a definite
domain. It will then be shown once more to the same competent
soldiers who have carried out the first assignment of AUTHORIZED
tasks in Phase II. Besides, experts in training, in planning
with regard to material to be used and in organisation will be
consulted in Phase III, in order to enable the clarification of
divergencies between the results concerning AUTHORIZED and ACTUALLY
performed tasks and also the storing of amended AUTHORIZED tasks.
The results of that work are uniform job descriptions for the armed
forces which will be stored in a data file. Thus these descriptions
are already available for various purposes.
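The AUTHORIZED/ACTUAL comparison described for Phase III can be pictured as a comparison of two task sets per job. The following is only a minimal sketch under assumed conventions: the function name, the task identifiers and the three result categories are hypothetical and are not taken from the ADP procedure itself.

```python
def compare_authorized_actual(authorized, actual):
    """Return the divergences between AUTHORIZED and ACTUAL task sets for one job.

    Both arguments are iterables of task identifiers (hypothetical codes).
    """
    authorized, actual = set(authorized), set(actual)
    return {
        # tasks assigned to the job on paper but not reported by job holders
        "authorized_not_performed": sorted(authorized - actual),
        # tasks reported by job holders although not assigned to the job
        "performed_not_authorized": sorted(actual - authorized),
        # tasks on which both inquiries agree
        "in_agreement": sorted(authorized & actual),
    }


# Hypothetical task identifiers for a single job.
result = compare_authorized_actual(
    authorized={"T01", "T02", "T03"},
    actual={"T02", "T03", "T07"},
)
print(result)
```

The two divergence lists are the items that would be shown again to the competent soldiers and experts for clarification.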

(Slide 17, analysis of requirements which must be fulfilled)

Phase IV

The analysis of requirements which must be fulfilled is initiated
in this phase. The requirements connected with the tasks to be
carried out by a job holder must therefore be determined and
recorded during the next steps. The three categories concerned
are the following:

- capabilities,

- knowledge,

- skills.

By "capabilities" we understand the relatively long-lasting
physical and psychic quality of a person enabling the acquisition
or application of skills and knowledge.

By "knowledge" we understand what a person must know for
being able to perform a function successfully. Knowledge
is acquired primarily by theoretical instruction and is a
prerequisite of skills.

By "skills" we understand, on the contrary, the intellectual
as well as the physical attainments of a person. Skills are
acquired above all during practical training.

Phase IV comprises the preparatory work for the analysis of
requirements which must be fulfilled; it begins with the
collection of material. It is based on the list of specific
forms of tasks and on numerous documents of the Federal Armed
Forces, such as training plans, training programs etc. This
list must be formulated on a level which is uniform for the
Federal Armed Forces with regard to language and contents.

During the next step, the preliminary collections of material
are supplemented in talks with specialist instructors and
competent soldiers experienced in practice, by adding terms not
yet contained in the collections. Besides, the experts correct
and check the preliminary collections of material. The collections
amended in this way are recorded by ADP and stored in a data file
on collections of material.

The concrete preparation of the inquiry begins in the next step.

In this context, a balanced relation between instructors at
schools and superiors in the troops is particularly important.

The selection of the best experts for the assignment of knowledge
and skills to the task in question is of decisive importance.

Phase V

With the aid of the documents prepared the experts are briefed in
this phase in the procedure of assigning knowledge and skills. It
is incumbent upon them to assign the necessary objects, fundamentals
and procedures to the specific forms of tasks they have to deal with.

At the same time they have to establish the required level of
knowledge.

Thereupon the experts select, among the assigned procedures,
those procedures which, within the scope of execution and of specific
forms of tasks, must be mastered as skills. The degree of mastering
the skills in question will be determined according to a scale of
5 values.

In the next step, the experts assign, out of the list "Particular strain/
working conditions" consisting of about 20 characteristics, those
characteristics which are typical in the execution of the specific
forms of the tasks concerned. Besides, degrees of strain are assigned
to these selected characteristics; the respective scale is divided
into three values. These degrees of strain may be referred in different
ways to normal duty or to exercises, it is true, but only the highest
degree of strain is evaluated.
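The rule that only the highest degree of strain is evaluated can be illustrated with a short sketch. It assumes, purely for illustration, that the three scale values are coded 1 to 3 and that a characteristic not assigned under one condition counts as 0; the characteristic names and the function name are hypothetical.

```python
def evaluated_strain(normal_duty, exercises):
    """Keep, per characteristic, the highest degree of strain across both conditions.

    Each argument maps a characteristic to a degree on an assumed 1-3 scale;
    a characteristic missing from one mapping is treated as degree 0.
    """
    characteristics = set(normal_duty) | set(exercises)
    return {
        name: max(normal_duty.get(name, 0), exercises.get(name, 0))
        for name in sorted(characteristics)
    }


# Hypothetical degrees assigned by the experts for one specific form of a task.
normal_duty = {"time pressure": 1, "permanent noise": 2}
exercises = {"time pressure": 3, "heat radiation": 2}

strain_profile = evaluated_strain(normal_duty, exercises)
print(strain_profile)
```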

During the last step, the experts deal with the capabilities required
for executing a specific form of a task. They select these capabilities
from a catalogue in which 20 capabilities are described. The
classification is made according to a scale of five values showing a
number of examples. These indications are recorded on an inquiry form
capable of being used for ADP.

Phase VI


During this phase, the data obtained by the inquiries are checked
and stored. The results are printed out in the form of profiles
for requirements which must be fulfilled with regard to "specific
forms of tasks". Such a profile contains indications concerning:

1. necessary knowledge of

- objects at which the specific form of the task is executed;

- objects with which the specific form of the task is executed;

- fundamentals;

- procedures and methods applied for executing the specific
form of the task;

2. skills required for applying or performing

- procedures or methods used for executing the specific form
of the task;

3. "particular strain / working conditions";

4. capabilities.

The thus established profiles for the requirements which must be
fulfilled are submitted to a selected group of experts among specialist
instructors and competent soldiers experienced in practice as well as
to military psychologists and physicians for occupational medicine for
examination. In cases of evident lack of plausibility the group
of experts may revert to indications obtained during phases II
and III. The possibilities of correction, e.g. indications on
importance and frequency of performance of a task, contained
therein may enable amendments in connection with the indications
obtained in phase V. The results are processed by ADP and stored
in the corresponding data files.

(Slide 20)

Phase VII

During that last phase, the final evaluation and the setting up
of assignment packages take place. The results of the determination
and registration of tasks and those of the analysis of requirements
which must be fulfilled are summarised. The profiles of requirements
which must be fulfilled with regard to individual functions of
specialised groups within the armed forces are compared. These profiles
are classified and arranged in packages after weighting. The descriptions
of functions printed out by ADP and the results of cluster analyses
are useful for the evaluation and enable the setting up of assignment
packages.

The results of this final evaluation procedure are included in the
data file for profiles of requirements which must be fulfilled with
regard to the functions. The possibilities of application by the
requesting agency are manifold, e.g. the planning of uniform
training of the armed forces in compliance with the requirements, the
reasonable redistribution of tasks, a steering in personnel matters
which is to a large extent adapted to the requirements as well as to
the wishes of the individual soldiers.

This is what I wanted to say about the instrument "Funktionsanalyse
Personalstruktur"*) and its seven phases.

On the basis of these statements, Admiral Fiebig will give you an
outlook on the further development of the "Funktionsanalyse
Personalstruktur"*).

V. Outlook

May I sum up these statements by the following valuation:

Planning and organisation of the "Funktionsanalyse Personalstruktur"*)
have been completed.

(Slide 21)

We have found out that a total of 104 catalogues of questionnaires
must be elaborated.

47 of these catalogues cover 50 to 200 military specialist tasks or
tasks of superiors, and 57 catalogues up to 50 military specialist
tasks. The tasks of tactical command and control are included in these
figures.

By the end of 1976, a total of 28 catalogues of questionnaires on tasks
and lists of specific forms of tasks, that is about one third of the
necessary catalogues, were completed.

By the end of 1977, presumably about two thirds of all catalogues will
be elaborated. Then it will be possible to represent about one half
of all jobs of the armed forces as well as their functions.

(Slide 22)

For the inquiry on ACTUALLY performed tasks it will be necessary to
interrogate a total of approximately 54,000 soldiers. As one team
interrogates about 30 soldiers per day, a total of 1,800 interrogation
days are required.
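The workload figure follows from simple division, and the same arithmetic shows the effect of the five teams that the slides list as planned for 1978. The sketch below is illustrative only; the function name and the assumption that all teams work in parallel at the same rate are not from the briefing.

```python
import math


def interrogation_days(soldiers, soldiers_per_team_day, teams=1):
    """Days needed when `teams` teams each interrogate a fixed number of soldiers per day."""
    return math.ceil(soldiers / (soldiers_per_team_day * teams))


# Figures quoted in the briefing: 54,000 soldiers, about 30 per team per day.
one_team = interrogation_days(54_000, 30)
# Assumption for illustration: five teams working in parallel at the same rate.
five_teams = interrogation_days(54_000, 30, teams=5)
print(one_team, five_teams)
```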

This inquiry on ACTUALLY performed tasks shall in the main be
completed in 1981.

But the continuous evaluation of intermediate results is possible
already now. After completion of the determination and registration
of tasks, the available data material will furnish for instance:

(Slide 23)

- task descriptions concerning all kinds of functions in the armed
forces, classified according to generic terms, hierarchically arranged
pursuant to the degree of difficulty of task performance and to the
different degrees of responsibility connected with the task in
question, subdivided with regard to directing responsibility and
executing responsibility,

- job descriptions, referred to the present organisation, with
indications on all specific forms of tasks to be executed in
the job, and on the distribution of the degrees of importance
of the tasks within the job,

- indications on working conditions and environmental conditions as
well as on working hours of jobs; particular strain and working
conditions referred to tasks,

- indications on danger of accidents/safety of work and on means
for ensuring that safety on the jobs,

- indications on the frequency of performance of tasks in
normal duty and during exercises,

- indications on licences, permits and evidence that particular
regulations have been complied with,

- indications on courses attended,

- indications on appointments,

- indications on performance of general tasks and on the
frequency of performance,

- indications on additional functions and on their distribution
to jobs,

- indications on contentment with regard to occupation.

The data material is arranged in such a way that it enables
uniform evaluation in the armed forces.

(Slides 24 and 25)

This becomes particularly clear when we consider the so-called

standard of functions which includes inter alia the following
functions:

As these statements show, the use of the instruments of
"Funktionsanalyse Personalstruktur"*) has been fully initiated
and is energetically supported by all services.

Independently of the organizational regulations, the cooperation
of the troops remains after all decisive for the success of the
work. Above all it is important to interrogate the right specialist
instructors at the schools of the services, for only then will it
be possible to determine and register reliably the tasks necessary
for performing a function and to describe them according to uniform
criteria. Beside the judgment of these theorists at the schools,
the interrogation of the practitioners in the troops is of course of
equal importance. There, soldiers who know the terms of reference
very thoroughly in practice have to give the relevant information.

Since May 1976, these interrogations have constituted a workload
of no insignificant extent for schools and troops. In this
connection, the insight into the importance of this work with regard
to a new manpower structure of the armed forces outweighs
the difficulties which may arise in certain cases. But again and again
it becomes evident how important it is to motivate in the right way
the persons concerned. This aspect must not be disregarded. It even
becomes more and more significant the longer the work lasts.

In the meantime, organization, flow diagram and time schedule have
been coordinated in such a way that it may be said that on the whole
the term originally pre-planned by the Manpower Structure Commission
for the completion of the work concerning a new manpower structure,
the year 1981, can in the main be complied with.

The instruments of the analysis of functions which were outlined
to you today and which serve for elaborating the new manpower
structure of the armed forces of the Federal Republic of Germany
constitute an essential prerequisite for the solution of numerous
interservice tasks of the Federal Armed Forces. For this reason,
the Federal Minister of Defense explicitly ordered in 197[?] the
continuation of the analysis of functions. But he did so under
two premises:

1. The work may be performed only as basic research for the
Federal Armed Forces themselves.

2. Conceptual models on the evaluation of jobs within the
scope of the analysis of functions must not have any
implications on public service as a whole.

This direction by the Federal Minister of Defense was necessary
because in our country the Ministry of the Interior, not the
Ministry of Defense, is competent for public service.

But independently of our work, the competent department also
intensely deals already with these problems, and I would not exclude
that fundamental results of our investigations may gain validity
for other domains, too.

(Slide 25)

Independently of the aforegoing we see that the analysis of
functions will retain its importance also in the far future,
for the introduction of new weapon systems and of new command
and control systems as well as organisational changes must
again and again be analysed, so that it will be possible to
draw conclusions within the Federal Armed Forces, for example
with regard to the modification of the manpower structure of
the armed forces. For this reason, the instrument "analysis
of functions" is built up as an open system which possesses
a sufficient flexibility also with regard to future developments
and changes of multiple kinds.

FUNKTIONSANALYSE PERSONALSTRUKTUR *)

=  PART  OF  THE  FUNDAMENTAL  WORK 

FOR A

NEW  MANPOWER  STRUCTURE  OF  THE  ARMED  FORCES 

ON  THE  BASIS 
OF 

RESULTS  OF  INVESTIGATIONS 
CARRIED  OUT  BY  THE 

MANPOWER  STRUCTURE  COMMISSION  OF  THE 
FEDERAL  MINISTRY  OF  DEFENSE 

IN  1971 


*) Analysis of the functions pertaining to a job


MODERN  MANPOWER  STRUCTURE 
MUST  IN  A  BALANCED  WAY  TAKE  INTO  ACCOUNT: 


1.  AIM  OF  ORGANIZATION 


2.  CAPABILITIES 

PERSONAL  CHARACTERISTICS 
PROFESSIONAL  EXPECTATIONS 


OF  THE  INDIVIDUAL 
SOLDIER 


3.  TECHNICAL 

POLITICO-ECONOMIC 

SOCIAL 


DEVELOPMENTS 


FUNKTIONSANALYSE PERSONALSTRUKTUR *)

SERVES  INTER  ALIA  FOR 

1.  SYSTEM-ORIENTED  PERSONNEL  ORGANIZATION 

2. PERSONNEL-ORIENTED JOB CHARACTERISTICS

3. ENABLING COMPARISON OF PROFILES

- FOR REQUIREMENTS WHICH MUST BE FULFILLED

- FOR QUALIFICATIONS


4.  BUILDING  UP  OF  ASSIGNMENT  PACKAGES 


5.  WEIGHTING  OF  FUNCTIONS 


*) Analysis of the functions pertaining to a job

PROBLEM

RECORDING  OF  CORRELATION 
BETWEEN  PERFORMANCE  AND  EFFICIENCY 


IMPORTANT 


BECAUSE: 

CORRELATION BETWEEN PERFORMANCE AND
EFFICIENCY

= CHARACTERISTIC OF UNITS AND
SUB-UNITS ORGANIZED ACCORDING
TO THE PRINCIPLE OF DIVISION OF LABOR


APART FROM THE REPORT OF THE MANPOWER STRUCTURE COMMISSION
REPORTS WERE ALSO SUBMITTED BY

1. THE FORCE STRUCTURE COMMISSION
(ON EQUITY IN INDUCTION)

2. THE EDUCATION COMMISSION
(ON REORGANIZATION OF EDUCATION AND TRAINING)

HEREWITH SOMETHING LIKE A "LESS APPROPRIATE TIMING"
WAS CREATED


SLIDE  S 


FUNKTIONSANALYSE PERSONALSTRUKTUR *)

FIRST TESTING IN 1973
WITH 3,400 SOLDIERS

FROM  THE  FOLLOWING  FIELDS  /  SPECIALIZED  GROUPS: 

-  INFANTRY  EMPLOYMENT 

-  AERONAUTICAL  EMPLOYMENT 

-  NAUTICAL  EMPLOYMENT 

-  PERSONNEL  ADMINISTRATION 

AT  THAT  OCCASION 

THE  FUNCTIONS  PERTAINING  TO 

90  MILITARY  OCCUPATIONAL  SPECIALTIES 
OF  ALL  SERVICES 

WERE  RECORDED 

*) Analysis of the functions pertaining to a job

EVALUATION

FIRST  TESTING 

DEFICIENCIES 
STILL  EXISTING: 

-  TOO  COMPREHENSIVE  DATA  MATERIAL 

-  NEGLECT  OF  CRAFTSMANSHIP 

- NO LEADS WITH REGARD TO LEVELS OF
OBJECTIVES OF LEARNING

Flow Diagram


"Funktionsanalyse Personalstruktur"
(Analysis of the functions pertaining to a job)


Determination and
registration of tasks


Phase I

Preparatory work
Drafting of task catalogues


Phase II

Determination and registration
of ACTUALLY performed and
of AUTHORIZED tasks


Comparison AUTHORIZED/
ACTUAL


Job description


Description of functions


Analysis of requirements
which must be fulfilled


Phase IV (21) - (26)

Preparatory work

Setting up of material collection


Knowledge, skills
Assignment with regard to
characteristics
"Particular strain/working
conditions", capabilities


Phase VI

Evaluation, control
Drafting of requirements which
must be fulfilled with regard to
specified tasks


Phase VII (34) -

Evaluation, comparison
Cluster analyses


Profiles for requirements
which must be fulfilled


Assignment packages


SLIDE 9


FUNCTION 

= TOTALITY OF TASKS (JOB)

TASK

= MAIN ELEMENT OF DESCRIPTION
UNIFORM DESCRIPTION OF TASKS

= PREREQUISITE FOR COMPARABILITY
OF FUNCTIONS

5 GROUPS OF TASKS

1. TASKS OF TACTICAL COMMAND AND CONTROL

2. MILITARY SPECIALIST TASKS

3. GENERAL TASKS OF SUPERIORS

4. GENERAL TASKS

5. ADDITIONAL TASKS / FUNCTIONS


SLIDE  11 


EXAMPLE  CONCERNING  RULE  1 


AS TO THE FORMULATION OF THE INDIVIDUAL TASKS, THE MODE OF
EXPRESSION USED SHOULD BE AS UNIFORM AS POSSIBLE


(VERB + PREDICATE)


EXAMPLE CONCERNING RULE 2


THE FORMULATIONS OF TASKS CONSTITUTE LEADS FOR THE SERVICEMEN TO
BE INTERROGATED WHICH FACILITATE THE ANSWERING OF QUESTIONNAIRES
ON TASKS. FOR THIS REASON IT MUST BE AVOIDED THAT THE INTERROGEES
FAIL TO ANSWER THE APPLICABLE QUESTIONNAIRE ON TASKS FOR THE ONLY
REASON THAT THE FORMULATION OF TASKS IS AMBIGUOUS OR NOT USUAL.


SLIDE 13


EXAMPLE  CONCERNING  RULE  3 


A  TASK  SHALL  CONSTITUTE  A  WORKING  CYCLE  /  COMPLEX  CLEARLY 
DELINEATED  WITH  REGARD  TO  TIME  REQUIRED  AND  SPECIALTY. 

SEVERAL  JOB  HOLDERS  OF  DIFFERENT  LEVELS  MAY  PARTICIPATE  IN 
PERFORMING  A  TASK. 

NOTE: ONE JOB HOLDER MUST BE IN A POSITION TO PERFORM THE
SPECIFIC TASK DEFINED; FOR THIS REASON IT MUST NOT CONSTITUTE
THE TASK OF A UNIT / SUB-UNIT.


COMPLEXITY


= CHARACTERISTIC OF


-  CONTENTS 

-  DEGREE  OF  DIFFICULTY 


OF  A  TASK 


EXECUTING  RESPONSIBILITY 


=  RESPONSIBILITY  FOR  ACTING  BY 
APPLICATION  OF  PROCEDURES  OR 
METHODS 


DIRECTING  RESPONSIBILITY 


=  RESPONSIBILITY 

-  FOR  INFLUENCING  BEHAVIOR  OF 
OTHER  PEOPLE 

AND 

-  FOR  PURPOSE-ORIENTED  USE 
OF  MATERIAL 


Determination and Registration of Tasks Phase II (ACTUAL)


Determination and Registration of Tasks Phase III (Evaluation)


Immediate measures, application, information (20)


CAPABILITY

=  LASTING  PHYSICAL  AND  PSYCHICAL  ABILITY  OF 
A  PERSON  TO  ACQUIRE  OR  APPLY  KNOWLEDGE 
AND  SKILLS 


KNOWLEDGE 

= ATTAINMENTS NECESSARY FOR CARRYING OUT A
FUNCTION (PRIMARILY THEORETICAL)


SKILLS


= ABILITIES NECESSARY FOR CARRYING OUT A
FUNCTION (PRIMARILY PRACTICAL)


SLIDE  20 


Funktionsanalyse Personalstruktur Phase VII (Analysis of functions pertaining to a job)
(Evaluation and determination of assignment packages)


Results obtained
in Phase VI


Evaluation

- Comparison of profiles of requirements which
must be fulfilled for individual functions

- Summarizing of profiles of requirements which
must be fulfilled, classification and weighting
in packages


Assignment packages


Description of functions


Data file on profiles of requirements which must be
fulfilled with regard to functions


Information


ALTOGETHER NECESSARY:

-  104  CATALOGUES  OF  QUESTIONNAIRES  ON  TASKS, 

EACH CATALOGUE CONTAINING UP TO
200 TASKS

BY  THE  END  OF  1976 

-  28  CATALOGUES  AND  LISTS  CONTAINING  DEFINITIONS 

OF  SPECIFIC  TASKS 
WERE  COMPLETED 

BY  THE  END  OF  1977,  PRESUMABLY 

-  TWO  THIRDS  OF  ALL  CATALOGUES  WILL  BE  COMPLETED 

- A DESCRIPTION OF 50 % OF ALL JOBS / FUNCTIONS WILL
BE POSSIBLE

FOR INQUIRY ON ACTUALLY PERFORMED TASKS
ARE  NECESSARY: 

-  54,000  SOLDIERS 

-  1,800  INTERROGATION  DAYS  (1  TEAM) 

PLANNED  FOR  1978: 

-  5  TEAMS 

1981  PRESUMABLY 

- TERMINATION OF THIS INQUIRY

INTERMEDIATE RESULTS


POSSIBILITY  OF  CALLING  AT  ANY  TIME  INTER  ALIA: 


-  TASK  DESCRIPTIONS  ACCORDING  TO  GENERIC 
TERMS 


- JOB DESCRIPTIONS ACCORDING TO PRESENT
ORGANIZATION


- MANY INDIVIDUAL QUESTIONS (COMPARE
SLIDES 24 AND 25)


SLIDE  24 


STANDARD  OF  FUNCTIONS 


1. Identification, for example:

number in TO&E, designation in TO&E
appointment
own forces / means
position as superior
security classifications, clearance
knowledge of foreign languages (according to code of the
Federal Armed Forces)


2. Descriptive characteristics, for example:

main tasks
operational command and control tasks
tactical command and control tasks
general tasks of superiors
knowledge, skills
knowledge of objects
knowledge of fundamentals
environmental / working conditions

3. Employment, training, for example:

necessary previous military employment
desired previous military employment
necessary courses
possible further military employment (horizontal)
possible further military employment (vertical)
transition to the civilian sector

4. Evaluating characteristics, for example:

minimum enlistment period of reenlisted men
area of responsibility
level of responsibility


OCCUPATIONAL  TASK  FACTORS  FOR  INSTRUCTIONAL  SYSTEM  DEVELOPMENT 


By 

William  Stacy 
Nancy  Thompson 
Sq  Ldr  David  Thomson 


Occupation  and  Manpower  Research  Division 
Air  Force  Human  Resources  Laboratory 
Brooks  AFB,  TX 

I. INTRODUCTION


The Air Force Human Resources Laboratory has been engaged in a
long range research effort to determine task training requirements based
upon occupational survey data. The methodology being developed is in
support of the Air Force Instructional Systems Development (ISD) program.
Specifically, occupational task factors are being developed and utilized
to provide data in meeting the ISD requirements of analyzing the system
or job requirements and defining educational training needs based upon
job performance requirements.

The primary consideration for including or excluding a task for
training has been based upon occupational survey data on the probability
of airmen performing certain tasks in their first job assignment. Changes
in curriculum design based upon these types of studies have saved the
Air Force millions of dollars by eliminating training on tasks performed
by low percentages of first-term airmen in the field. However, the
probability of the performance of a task may not be the only factor to be
considered in establishing training requirements. For example, if the
consequences of inadequate performance of a task are hazardous or costly
to human life or property, that task should be considered for training
regardless of the frequency of performance. In recent years, AFHRL has
hypothesized and researched a number of task training factors important
in determining how much emphasis should be placed on tasks for training.

The basic theory behind the Air Force task training factor research
was conceived and reported by Christal (1970). In the design of the
research a number of current task factors have been identified which can
provide training development personnel with an objective procedure for
determining a task's priority for training. Data are currently being
collected on the following task factors:


a. Percent members performing. The percent of airmen in the
career specialty performing the task in their jobs.

b. Task difficulty. The time required to learn to perform the
task satisfactorily.

c. Consequences of inadequate performance. The perceived consequences
if the task is incorrectly performed, considered in terms of destroyed
material, wasted time, injury, or loss of life.

d. Task delay tolerance. A measure of how much delay can be tolerated
between the time the airman becomes aware the task is to be performed and
the time he must commence doing it.

e. Field recommended training emphasis. A measure of the task's
recommended formal training emphasis (either school or OJT), based upon the
ratings of tasks by 7- and 9-skill-level field NCOs.

f. School training emphasis. A measure of the task's current training
emphasis in resident training, based upon the ratings of tasks by course
instructors.

The task training factor methodology has many potential applications.
For each of the task factors under consideration, the tasks from occupational
surveys can be ordered in sequence. For example, the occupational
survey tasks can be ordered in descending sequence based upon the arithmetic
mean ratings of consequences of inadequate performance. Information
may also be provided comparing school training emphasis with field recommended
training emphasis. In addition, using multiple regression analysis,
the task factors can be used as predictor variables to capture the judgments
of school training personnel and recommended training emphasis from
the field. The task factors can be applied in selecting tasks for
training courses, developing specialty training standards, validating
current training courses, or redesigning an existing course as a result
of changes in course length or changes in the career specialty. The task
training factor methodology does not provide all answers to training course
decisions; the methodology is primarily an advanced aid to course design
and is subject to override by other training considerations as required.
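
The ordering of tasks by mean rating described above can be sketched in a few lines of Python. This is only an illustration: the task names, the ratings, and the function name rank_tasks_by_mean_rating are hypothetical and do not come from the occupational survey data.

```python
from statistics import mean


def rank_tasks_by_mean_rating(ratings):
    """Return (task, mean rating) pairs sorted from highest to lowest mean."""
    means = {task: mean(values) for task, values in ratings.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings of consequences of inadequate performance
# (three raters per task); not taken from any survey.
example_ratings = {
    "Task A": [7, 8, 9],
    "Task B": [2, 3, 4],
    "Task C": [5, 5, 8],
}

ranked = rank_tasks_by_mean_rating(example_ratings)
for task, value in ranked:
    print(f"{task}: {value:.2f}")
```

The same function could be applied to any one of the task factors, since each is summarized as a mean rating per task.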

The development of task training priority factors was described by
Mead (1975) at the 17th annual conference of the Military Testing
Association. In Mead's paper, procedures for validating task factors
were illustrated by two studies. In one validation study (Mial & Christal,
1974), one hundred ninety first-term airman tasks in the Medical Services
(902X0) career specialty were placed on 4" x 6" cards and were rank
ordered by curriculum specialists according to their priority for resident
technical training. The mean rank values served as the criterion measure
and were predicted by the task factors using multiple regression analysis.
This policy capturing procedure resulted in a four-variable task factor
equation whose predicted values correlated R = .88 with the mean rankings
of the curriculum specialists.


A second validation study was conducted by Mead (1975) using
the Law Enforcement (811XX) specialty. Training curriculum specialists
rank ordered a representative sample of 165 apprentice and journeyman
level tasks as to their priority for formal training. The mean
rank values for each task were used as the criterion values. The
training priority policy of the curriculum specialists was predicted
using four task factors and derivations of these factors. The
equation provided training priority values which correlated R = .95
(R² = .91) with the criterion values. These studies strongly
indicated that the task training policies of curriculum specialists
could be duplicated mathematically using the task factor prediction
model.
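
The policy capturing idea in these studies (regress a criterion on the task factors, then correlate predicted with observed values) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the studies: the two factors, the criterion values, and all function names are hypothetical, and ordinary least squares is solved with a small hand-written elimination routine so that the example needs only the standard library.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_policy(factors, criterion):
    """Least-squares weights (intercept first) predicting the criterion."""
    rows = [[1.0] + list(f) for f in factors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, criterion)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, factors):
    """Apply the fitted equation to each task's factor values."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], f)) for f in factors]


def multiple_r(predicted, criterion):
    """Pearson correlation between predicted and observed criterion values."""
    mp, mc = mean(predicted), mean(criterion)
    cov = sum((p - mp) * (c - mc) for p, c in zip(predicted, criterion))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_c = sum((c - mc) ** 2 for c in criterion)
    return cov / (var_p * var_c) ** 0.5


# Hypothetical data: two task factors per task and a training priority criterion.
example_factors = [(2.0, 7.5), (6.5, 3.0), (4.0, 4.5), (8.0, 8.5), (3.5, 2.0), (5.5, 6.0)]
example_criterion = [4.1, 4.4, 3.6, 8.2, 2.3, 5.6]

weights = fit_policy(example_factors, example_criterion)
r = multiple_r(predict(weights, example_factors), example_criterion)
print(f"R = {r:.2f}, R squared = {r * r:.2f}")
```

With real data, the criterion would be the mean rank values assigned by the curriculum specialists and the predictors would be the task factor means for the same tasks.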


II.  CURRENT  TASK  FACTOR  RESEARCH  PROCEDURES 


Two types of scales are currently applied in the collection of
ratings on task difficulty, task delay tolerance, and consequences of
inadequate performance. Relative task factor scales have been used
to rate each task in a career specialty relative to the other tasks
in that specialty. Benchmark or task-anchored scales have also been
designed which are used to rate the tasks in one specialty compared
with prescribed levels of tasks in other Air Force specialties.

A major limitation of the relative scales is that the tasks in a
career specialty are only rated in a context with the other tasks in
that specialty.

For example, using a relative scale, a cook may
rate "salting the meat" very high in consequences
of inadequate performance. Yet, in a task-anchored
scale, the consequences of inadequate performance
of "salting the meat" may not seem very serious
when compared with inadequate performance of other
Air Force tasks.

In the past several years AFHRL has demonstrated considerable
success in the development of task-anchored rating scales. The purpose
of the task-anchored scales is to permit comparisons of tasks within
one career specialty with representative tasks performed in other Air
Force specialties. Task-anchored scales have been developed for three
task factors: task difficulty, consequences of inadequate performance,
and task delay tolerance. The scales were also developed to represent
three aptitude areas: a combined administrative or general requirement;
electronic aptitude; and mechanical aptitude. An illustration of an
administrative/general aptitude task-anchored scale is presented in Appendix
A. The scale is composed of 27 tasks, subdivided into nine subgroups of

¹ The term benchmark scales has recently been changed to task-anchored
scales to provide additional clarity.
three tasks. Each subgroup represents one of nine levels on the scale.

In actual use, raters compare each of the tasks in their specialty
against the 27 tasks in the task-anchored scale. For each task in his
specialty the rater decides which level and subgroup of three
tasks is the most similar on the factor being considered. The development
of the task-anchored scales for the administrative/general aptitude
area has been reported by Goody (1976). The development of the electronic
and mechanical task-anchored scales has also been completed but not reported.

In Appendix B, inter-rater agreement of relative and task-anchored
(benchmark) scale ratings is reported using the intra-class correlation
technique (Lindquist, 1953, p. 361). From these reliability estimates
of mean ratings it appears that the task factors used in the studies
are generally stable and reliable measures. It has been demonstrated
for the most part that the reliability estimates of mean ratings
obtained from experienced NCOs are acceptably high and consistent.

In Appendix C, for an additional comparison of rater agreement
between relative and task-anchored (benchmark) scales, four career
specialties have been surveyed with both scales and their sample
sizes adjusted to a common N = 50. It appears generally that the
benchmark scale raters were as good or slightly better in their
agreement with each other than the relative scale raters. The percentage
of deviant raters who were deleted from the study because they used
the scale upside down or did not properly follow instructions was noticeably
less for the task-anchored scale raters.
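
As an illustration of the reliability estimates discussed above, the sketch below computes intraclass correlations from a tasks-by-raters table and adjusts the single-rater value to a common number of raters. It assumes a one-way analysis of variance form of the intraclass correlation and the Spearman-Brown formula for the adjustment; the paper does not state which exact formulas were applied, and the ratings and function names are hypothetical.

```python
from statistics import mean


def intraclass_correlations(ratings):
    """ratings: one row per task, one column per rater. Returns (r_11, r_kk)."""
    n_tasks = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    task_means = [mean(row) for row in ratings]
    ss_between = k * sum((m - grand) ** 2 for m in task_means)
    ss_within = sum((v - m) ** 2 for row, m in zip(ratings, task_means) for v in row)
    ms_between = ss_between / (n_tasks - 1)
    ms_within = ss_within / (n_tasks * (k - 1))
    # r_11: estimated reliability of a single rater's ratings.
    r_11 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    # r_kk: estimated reliability of the mean of the k raters' ratings.
    r_kk = (ms_between - ms_within) / ms_between
    return r_11, r_kk


def spearman_brown(r_11, n_raters):
    """Project single-rater reliability to the mean of n_raters raters."""
    return n_raters * r_11 / (1 + (n_raters - 1) * r_11)


# Hypothetical ratings: four tasks rated by three raters.
example = [[8, 9, 7], [2, 3, 2], [5, 6, 5], [7, 7, 8]]
r_11, r_kk = intraclass_correlations(example)
r_50 = spearman_brown(r_11, 50)
print(f"r_11 = {r_11:.3f}, r_kk = {r_kk:.3f}, adjusted to N = 50: {r_50:.3f}")
```

Adjusting every specialty to the same number of raters, as with the common N = 50 above, allows reliabilities from samples of different sizes to be compared on an equal footing.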

For several reasons, future research studies will probably use
task-anchored scales in collecting task factor data. The benchmark scales
are more advantageous than relative scales because they provide more
information for making training decisions between career specialties,
appear to provide as good or better agreement among raters than
relative scales, and require fewer case deletions. The
task-anchored scales can also be used to rank order training priorities
of tasks within a single career specialty.

Appendix D provides correlations of school and field training
emphasis versus the task factors for twelve career specialties. The
task factor data were collected with relative scales. One important
correlation in Appendix D is the zero-order correlation between school
training emphasis and field recommended training emphasis. In some
of the career specialties there was a fairly high correlation between
what the field recommends be taught on job tasks and what the school
is currently teaching. In other career specialties the correlations
between school and field training emphasis were not as high. In interpreting
Appendix D, it appears that when correlations between school
and field training emphasis were low, the correlations of school emphasis
with task factors were also relatively low compared to correlations of
field emphasis with task factors. In such a case, it would seem that
the school was giving less consideration to the task factors than
was the field. However, low correlations between school and field
ratings do not necessarily mean inadequacies in school curriculum.
The schools are operating under a number of constraints as to what
can be taught in resident training, and field ratings are influenced
by consideration of OJT as well as resident training.

The analysis and display of Air Force task factor data is
best accomplished through the CODAP analysis system. CODAP is a
comprehensive set of computer programs for analyzing and reporting
occupational information collected with job inventories (Christal &
Weissmuller, 1976). One of the most frequently used CODAP programs
to illustrate task factor data is called FACSUM. Appendix E presents
a FACSUM difference description between school and field training
emphasis in the Medical Services (902X0) career specialty. The top
of the description shows tasks which have the largest difference
between what is recommended by the field for training and what is
currently being taught in the school. Tasks are listed in descending
sequence of these differences. The tasks listed at the bottom of the
description are those on which the school emphasis was greater than
the emphasis recommended by the field.
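
The ordering used in a difference description can be sketched as below. This is not the FACSUM program itself, only an illustration of the sort it is described as producing; the function name is hypothetical, and the four pairs of field and school emphasis values are copied from rows of Appendix E.

```python
def difference_description(emphasis):
    """Sort tasks by (field - school) emphasis, largest difference first."""
    rows = [
        (task, field, school, round(field - school, 2))
        for task, (field, school) in emphasis.items()
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# (field emphasis, school emphasis) for four tasks listed in Appendix E.
appendix_e_sample = {
    "Give Back Rubs": (4.54, 5.55),
    "Insert Oral Airways": (6.61, 0.30),
    "Make Occupied Beds": (5.07, 6.00),
    "Administer Primary Care at Scene of Accidents": (6.99, 1.35),
}

description = difference_description(appendix_e_sample)
for task, field, school, diff in description:
    print(f"{task:<48}{field:6.2f}{school:6.2f}{diff:7.2f}")
```

Tasks with large positive differences are those the field recommends more strongly than the school currently emphasizes; tasks with negative differences fall at the bottom of the listing.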

The task factor data developed in the Medical Services study proved
to be extremely valuable in aiding ISD and training development personnel
in making changes in the resident course based upon the task factor
information.

The power of the task factor data in aiding training development
personnel can be further evidenced in the Dental Specialists (981X0)
training course. In examining a FACSUM difference description
between school and field recommended training emphasis, two
tasks appeared at the top of the description.

Based upon the difference in training emphasis between school and
field and the low percentages of first-term airmen performing the tasks,
the school investigated the status of these two tasks. It was determined
that the two tasks were being performed primarily in a laboratory
function and were no longer considered routine tasks in the field.
The training of these two tasks in the resident course was suspended,
representing a savings of approximately fourteen and one-half hours in
the training course plan of instruction (POI).
III. SUMMARY

The task training factor methodology developed by AFHRL has
demonstrated many applications in determining the training priorities
of tasks based upon occupational survey data. The primary benefits
of this research have been in the use of task factor data for designing
training courses, developing specialty training standards,
validating current training courses, and redesigning
an existing course as a result of changes in course length or changes
in the career specialty. The development of the task-anchored scales
has opened an additional area of research for exploring differences in
training emphasis and priority between Air Force career specialties.

The recent collection of recommended task training requirements from
the field and of information concerning current training emphasis in
the schools has provided meaningful comparisons which can be used in
determining the priorities of tasks for training. In attempting to
capture the policies of field and training school personnel in establishing
training requirements, task factors have been used as predictor variables
in multiple regression analysis. The task factor methodology, although
still in the developmental process, has been demonstrated to be extremely
valuable in curriculum design.
APPENDIX A


TASK DELAY TOLERANCE

(Administrative/General)

DEFINITION

The Task Delay Tolerance of a task is a measure of how much delay can be
tolerated between the time the airman becomes aware the task is to be
performed and the time he must commence doing it.

BENCHMARK SCALE

Level 1 - Least Tolerance of Delay - Must do Immediately

Use artificial respiration to restore breathing of accident or fire victims (Fire Protection Specialist)
Give scramble orders to fighter aircraft (Command and Control Specialist)
Assist during treatment of cardio-respiratory failure in operating room (Operating Room Specialist)

Level 2

Quell disturbances involving military personnel (Security Specialist)
Identify tablets, capsules or liquids involved in poisoning cases (Pharmacy Specialist)
Operate safety console at missile control center during hazardous operations (Missile Safety Specialist)

Level 3

Inspect runway for foreign objects (Air Operations Specialist)
Administer anaesthesia in dental surgery (Dental Specialist)
Adjust airborne radio receivers to obtain intelligible signal (Radio Operator)

Level 4

Question suspects or witnesses (Security Specialist)
Perform colony count on [illegible] to determine type and level of infection (Medical Laboratory Specialist)
Maintain proper temperature of food storage areas (Cook)

Level 5

Identify military vehicles, installations or activities in [illegible] photographs (Intelligence Operations Specialist)
Proofread or correct teletype tape or page copies (Communications Center Specialist)
Prepare daily weather maps (Weather Forecaster Specialist)

Level 6

Operate computer remote inquiry terminals (Computer Operator)
Purge or [illegible] chemical [illegible] in film developing machines (Still Photographic Laboratory Specialist)
Service and maintain dental high-speed drilling equipment (Dental Laboratory Specialist)

Level 7

Monitor workload reporting systems (Manpower Specialist)
Brief personnel on state or local [illegible] traffic laws (Safety Specialist)
Draw up work rosters for taxi operators or drivers on large Air Force base (Programs and Work Control Specialist)

Level 8

Write item identification descriptions and specifications for catalogues (Procurement Specialist)
Interview or hire civilian personnel (Supply Service Specialist)
Prepare and analyze work flow process charts (Management Engineering Specialist)

Level 9 - Most Tolerant of Delay - Do when ready

Review or select books or publications for unit library (Administration Specialist)
Research and write feature stories in Air Force publications (Information Specialist)
Clean teeth of animals (Veterinary Specialist)
RELIABILITY OF TASK FACTOR RATINGS


APPENDIX D
APPENDIX E

FACSUM Difference Description Between Field and School
Training Emphasis, Medical Services 902X0
       Suture Lacerations                      6.31    .00   6.31   23.6   5.83   3.52   6.55
G 295  Insert Oral Airways                     6.61    .30   6.31   24.3   7.10   2.06   5.36
I 378  Administer Primary Care at Scene        6.99   1.35   5.64   23.9   7.41   1.87   6.44
         of Accidents
I 391  Drive Ambulances or Ambuses                …    .00      …   20.8      …   4.32   4.?
I 41?  Remove Sutures                          5.56    .00   5.56      …   4.40   5.96   4.39
       Prepare Plaster of Paris, Cotton, or    5.70    .20   5.50   23.5   4.19   5.00   4.92
         Other Materials for Fabrication
         of Casts
F 174  Prepare Patients for Minor Surgery      6.08      …      …   40.5   4.99   4.74   4.80
G 252  Apply Pneumatic Splints                 5.48    .05   5.43      ?   5.18   4.34   4.45
G 254  Apply Short Leg Plaster Casts           5.39    .00   5.39   15.5   5.21   5.39   5.42
G 253  Apply Short Arm Plaster                    …    .00      …   16.8      …   5.40   5.40
G 280  Draw Blood from Patients                5.38    .00   5.38   ?9.5   5.58   4.46   5.60
G 248  Apply Make-Shift Splints                5.59      ?      …   18.9   5.25   4.21   4.84
       Orient New Patients                     … 4.90 … 5.23
       Apply Heat by Hot Water Bottles         4.90   5.40   -.50   32.8   4.85   5.37   3.39
       Instruct Patients in Crutch Walking     4.95   5.50   -.55   32.8   4.83   5.42   4.60
G 312  Perform Oral Hygiene                    5.21   5.85   -.64   45.4   4.06   5.16   4.18
G 202  Administer Complete Bed Baths           5.55   6.25   -.70   51.2   3.00   5.40   3.?
H 361  Make Occupied Beds                      5.07   6.00   -.93   54.6   3.57   5.47   4.02
G 30?  Participate in Team Conferences         4.10   5.10  -1.00   37.5   3.71   6.?    4.44
G 284  Give Back Rubs                          4.54   5.55  -1.01   45.3   3.14   6.19   3.78
H 362  Make Postoperative or Recovery Beds     4.92   6.00  -1.08   51.0   3.13   4.91   3.65
G 199  Administer Bed Pans or Urinals          4.74   6.15  -1.41   66.4   2.65   4.19   2.86
       Make Unoccupied Beds                    … 6.29 3.11
       Serve Meal Trays                        3.63   5.40      …      …   3.44   5.07   3.14

REPORT  OF  STEERING  COMMITTEE 


& 

GENERAL  BUSINESS  MEETING  (1977) 


1. HARRY H. GREER AWARD:

Steering Committee approved presentation of the awards to Dr.
William Moonan, Naval Personnel and Research Development Command,
and to John A. Burt, U. S. Coast Guard Institute. (Texts are attached.)

2. ARTICLE III OF THE BY-LAWS

The  Steering  Committee  voted  not  to  change  the  wording  of  Article  III. 
This  article  deals  with  membership. 

3.  ARTICLE  VII  OF  THE  BY-LAWS 

The  Steering  Committee  and  the  General  Membership  approved  a  change 
of  Section  B  of  the  By-Laws  from  the  present  wording  of: 

B. The annual conference of the Association shall be coordinated
by the agencies of the associated armed services exercising primary
responsibility for military personnel assessment, in order of the
following rotating schedule:

United States Army
United States Marine Corps
United States Navy
United States Air Force
United States Coast Guard;

Henceforth to read:

B. The annual conference of the Association shall be coordinated
by the agencies of the associated armed services exercising primary
responsibility for military personnel assessment. The coordinating
agencies and the order of rotation will be determined annually by the
Steering Committee. The coordinating agencies for at least the
following three years will be announced at the annual meeting.

4.  COORDINATING  AGENCIES  1978-1982 

In conformance with Article VII, Section B, the hosting sites for the
above period are:

1978 Oklahoma City - Coast Guard Institute

1979 San Diego - Navy Personnel Research and
Development Center

1980 Toronto, Canada - Canadian Forces Personnel
Applied Research Unit

1981 Ft Eustis, Virginia - U. S. Army

1982 Pensacola, Florida - U. S. Navy, Program Development
Center
MILITARY TESTING ASSOCIATION
HARRY F. GREER AWARD

TO

JOHN A. BURT

To you, John A. Burt, the Military Testing Association owes its current
level of prominence and excellence. Your personal efforts over the last
ten years have helped insure the continuance of the MTA as a functioning
and growing organization, and have helped produce interesting, informative,
and important conferences over the years. Your continuing involvement in
the activities of the steering committee and the overall management of the
MTA has been appreciated. We all realize that much of this service to
the MTA has been accomplished at the cost of difficult and time-consuming
personal effort.

This award for outstanding service is most appropriately given in the name
of Harry F. Greer, as you have exemplified his aims and have carried out
his intentions in forming the MTA. This award is made with the gratitude,
friendship, and regard of all associated with the Military Testing
Association.


MILITARY TESTING ASSOCIATION
HARRY F. GREER AWARD
TO

DR. WILLIAM MOONAN


The Harry Greer Award is hereby presented to Dr. William Moonan of the
Naval Personnel Research and Development Center for your consistent
and lasting contributions to the purposes of the Military Testing
Association.

Year after year, your dedication to the scientific principles which
underlie the assessment of individuals has inspired each of us while
the integrity and innovation of your work has set an example for all.
Your original contributions in the statistical treatment of assessment
data are numerous.

The prodigious volume and inventive character of your work is of credit
to yourself, to the Navy and to the entire community of military
personnel assessment. For this we thank you. Therefore this award is
made with gratitude, friendship and regard of all associated with the
Military Testing Association.
BY-LAWS OF THE MILITARY TESTING ASSOCIATION*


Article  I  -  Name 

The  name  of  this  organization  shall  be  the  Military  Testing 
Association. 


Article  II  -  Purpose 

The  purpose  of  this  Association  shall  be  to: 

A. Assemble representatives of the various armed services of the
United  States  and  such  other  nations  as  might  request  to  discuss  and 
exchange  ideas  concerning  assessment  of  military  personnel. 

B.  Review,  study,  and  discuss  the  mission,  organization,  operations, 
and  research  activities  of  the  various  associated  organizations  engaged  in 
military  personnel  assessment. 

C.  Foster  improved  personnel  assessment  through  exploration  and 
presentation  of  new  techniques  and  procedures  for  behavioral  measurement, 
occupational  analysis,  manpower  analysis,  simulation  models,  training 
programs,  selection  methodology,  survey  and  feedback  systems. 

D.  Promote  cooperation  in  the  exchange  of  assessment  procedures, 
techniques  and  instruments. 

E.  Promote  the  assessment  of  military  personnel  as  a  scientific 
adjunct  to  modern  military  personnel  management  within  the  military  and 
professional  communities. 


Article  III  -  Participation 

The following categories shall constitute membership within the MTA:

A. Primary Membership.

1. All active duty military and civilian personnel permanently
assigned  to  an  agency  of  the  associated  armed  services  having  primary 
responsibility  for  assessment  for  personnel  systems. 

2.  All  civilian  and  active  duty  military  personnel  permanently 
assigned  to  an  organization  exercising  direct  command  over  an  agency  of 
the  associated  armed  services  holding  primary  responsibility  for  assessment 
of  military  personnel. 


*  As  approved  at  the  1977  General  Meeting  of  The  Association,  21  Oct  77, 
San  Antonio,  Texas 




B.  Associate  Membership. 

1. Membership in this category will be extended to permanent
personnel of various governmental, educational, business, industrial and
private organizations engaged in activities that parallel those of the
primary membership. Associate members shall be entitled to all privileges
of primary members with the exception of membership on the Steering
Committee. This restriction may be waived by the majority vote of the
Steering Committee.


Article  IV  -  Dues 

No annual dues shall be levied against the participants.


Article  V  -  Steering  Committee 

A.  The  governing  body  of  the  Association  shall  be  the  Steering 
Committee.  The  Steering  Committee  shall  consist  of  voting  and  non-voting 
members.  Voting  members  are  primary  members  of  the  Steering  Committee. 
Primary  membership  shall  Include: 

1.  The  Commanding  Officers  of  the  respective  agencies  of  the 
armed  services  exercising  responsibility  for  personnel  assessment  programs. 

2. The ranking civilian professional employees of the respective
agencies of the armed services exercising primary responsibility for the
conduct of personnel assessment systems. Each agency shall have no more
than two (2) professional civilian representatives.

B. Associate membership of the Steering Committee shall be extended
by majority vote of the committee to representatives of various governmental,
educational,  business,  industrial  and  private  organizations  whose  purposes 
parallel  those  of  the  Association. 

C.  The  Chairman  of  the  Steering  Committee  shall  be  appointed  by  the 
President  of  the  Association.  The  term  of  office  shall  be  one  year  and 
shall  begin  the  last  day  of  the  annual  conference. 

D.  The  Steering  Committee  shall  have  general  supervision  over  the 
affairs  of  the  Association  and  shall  have  the  responsibility  for  all 
activities  of  the  Association.  The  Steering  Committee  shall  conduct  the 
business  of  the  Association  in  the  interim  between  annual  conferences  of 
the  Association  by  such  means  of  communication  as  deemed  appropriate  by 
the President or Chairman.

E.  Meeting  of  the  Steering  Committee  shall  be  held  during  the  annual 
conferences  of  the  Association  and  at  such  times  as  requested  by  the 
President  of  the  Association  or  the  Chairman  of  the  Steering  Committee.  A 
majority  of  the  members  of  the  Steering  Committee  shall  constitute  a  quorum. 
Article VI - Officers

A. The officers of the Association shall consist of a President,
Chairman of the Steering Committee and a Secretary.

B.  The  President  of  the  Association  shall  be  the  Commanding  Officer 
of  the  armed  services  agency  coordinating  the  annual  conference  of  the 
Association.  The  term  of  the  President  shall  begin  at  the  close  of  the
annual  conference  of  the  Association  and  shall  expire  at  the  close  of  the 
next  annual  conference. 

C.  It  shall  be  the  duty  of  the  President  to  organize  and  coordinate 
the  annual  conference  of  the  Association  held  during  his  term  of  office, 
and  to  perform  the  customary  duties  of  a  president. 

D.  The  office  of  Secretary  of  the  Association  shall  be  filled  through
appointment  by  the  President  of  the  Association.  The  term  of  office  of  the
Secretary  shall  be  the  same  as  that  of  the  President.

E.  It  shall  be  the  duty  of  the  Secretary  of  the  Association  to  keep 
the  records  of  the  Association  and  the  Steering  Committee,  to
conduct  official  correspondence  of  the  Association,  and  to  issue  notices
for  conferences.  The  Secretary  shall  also  perform  such  additional  duties 
and  take  such  additional  responsibilities  as  the  President  may  delegate 
to  him. 


Article  VII  -  Meetings 

A.  The  Association  shall  hold  a  conference  annually.

B.  The  annual  conference  of  the  Association  shall  be  coordinated  by
the  agencies  of  the  associated  armed  services  exercising  primary
responsibility  for  military  personnel  assessment.  The  coordinating  agencies  and
the  order  of  rotation  will  be  determined  annually  by  the  Steering  Committee. 
The  coordinating  agencies  for  at  least  the  following  three  years  will  be 
announced  at  the  annual  meeting. 

C.  The  annual  conference  of  the  Association  shall  be  held  at  a  time 
and  place  determined  by  the  coordinating  agency.  The  membership  of  the
association  shall  be  informed  at  the  annual  conference  of  the  place  at 
which  the  following  annual  conference  will  be  held.  The  coordinating 
agency  shall  inform  the  Steering  Committee  of  the  time  of  the  annual 
conference  not  less  than  six  (6)  months  prior  to  the  conference. 

D.  The  coordinating  agency  shall  exercise  planning  and  supervision 
over  the  program  of  the  annual  conference.  Final  selection  of  program 
content  shall  be  the  responsibility  of  the  coordinating  organization. 

E.  Any  other  organization  desiring  to  coordinate  the  conference  may
submit  a  formal  request  to  the  Chairman  of  the  Steering  Committee,  no 
later  than  18  months  prior  to  the  date  they  wish  to  serve  as  host. 


Article  VIII  -  Committees

A.  Standing  committees  may  be  named  from  time  to  time,  as  required, 
by  vote  of  the  Steering  Committee.  The  chairman  of  each  standing  committee 
shall  be  appointed  by  the  Chairman  of  the  Steering  Committee.  Members  of 
standing  committees  shall  be  appointed  by  the  Chairman  of  the  Steering
Committee  in  consultation  with  the  Chairman  of  the  committee  in  question.
Chairmen  and  committee  members  shall  serve  in  their  appointed  capacities
at  the  discretion  of  the  Chairman  of  the  Steering  Committee.  The  Chairman
of  the  Steering  Committee  shall  be  an  ex  officio  member  of  all  standing
committees.

B.  The  President,  with  the  counsel  and  approval  of  the  Steering
Committee,  may  appoint  such  ad  hoc  committees  as  are  needed  from  time  to
time.  An  ad  hoc  committee  shall  serve  until  its  assigned  task  is
completed  or  for  the  length  of  time  specified  by  the  President  in
consultation  with  the  Steering  Committee.

C.  All  standing  committees  shall  clear  their  general  plans  of  action 
and  new  policies  through  the  Steering  Committee,  and  no  committee  or 
committee  chairman  shall  enter  into  relationships  or  activities  with 
persons  or  groups  outside  of  the  Association  that  extend  beyond  the
approved  general  plan  of  work  without  the  specific  authorization  of  the
Steering  Committee. 

D.  In  the  interest  of  continuity,  if  any  officer  or  member  has  any
duty,  elected  or  appointed,  placed  on  him  and  is  unable  to  perform  the
designated  duty,  he  should  decline  and  notify  at  once  the  officers  of  the
Association  that  he  cannot  accept  or  continue  said  duty.


Article  IX  -  Amendments 

A.  Amendments  of  these  By-Laws  may  be  made  at  any  annual  conference
of  the  Association.

B.  Amendments  of  the  By-Laws  may  be  made  by  majority  vote  of  the 
assembled  membership  of  the  Association  provided  that  the  proposed
amendments  shall  have  been  approved  by  a  majority  vote  of  the  Steering  Committee.

C.  Proposed  amendments  not  approved  by  a  majority  vote  of  the 
Steering  Committee  shall  require  a  two-thirds  vote  of  the  assembled
membership  of  the  Association.


Article  X  -  Voting

All  members  in  attendance  shall  be  voting  members.


Article  XI  -  Enactment 

These  By-Laws  shall  be  in  force  immediately  upon  acceptance  by  a 
majority  of  the  assembled  membership  of  the  Association  and/or  amended 
(in  force  2  November  1973).


ATTENDEES 


ADAMS,  William,  Jr.
Chief  of  Naval  Educ  &  Training
Naval  Air  Station
Pensacola,  FL  32508

ALBEOC,  Robert  D
Naval  Tech  Trng  Center
Corry  Station
Pensacola,  FL  32511

ALBERT,  Walter  G
AFHRL/SM
Brooks  AFB,  TX  78235

ALLEY,  William  E
AFHRL/OEE
Brooks  AFB,  TX  78235

AMMERMAN,  Harry  L,  Dr 
Center  for  Vocational  Educ 
1960  Kenny  Road 
Columbus,  Ohio  43210 

ANDERSON,  H.  R.
USCG  AVTRACEN
Bates  Field
Mobile,  AL  36608

ARCHER,  Wayne  B
AFHRL/ORA
Brooks  AFB,  TX  78235

ARIMA,  James  K
3115  Martin  Rd
Carmel,  CA  93921

ASA-DORIAN,  Paul  V,  Sr
Fleet  Anti-Submarine  Warfare  Tng  Cen
San  Diego,  CA  92147

AVERSANO,  Frank  M.  Dr 
US  Army  Trng  Support  Center 
Ft  Eustis,  VA  23604
ATTN:  ATTSC-TI-TD 

AVERY,  Doris  E 
Army  Educ  Center 
P.O.  Box  3533
Ft  Wainwright,  Alaska


BACHTEL,  Mary  A
Naval  Educ  &  Trng  Progr  Deve  Cntr
Ellyson  PD-9
Pensacola,  FL  32509

BAISDEN,  Annette
Psychology  Dept
Naval  Aerospace  Med  Research  Lab
Pensacola,  FL  32508

BAKER,  J.  G.,  Capt
Chief  of  Naval  Educ  &  Trng/N-8
NAS,  Pensacola,  FL  32508

BARAN,  Harry  A.
AFHRL/ASR
Wright-Patterson  AFB,  OH  45433

BAREFIELD,  Ruth  B
USAAVNC
Ft  Rucker,  AL  36362

BARLOW,  Bruce  H,  Capt 
Directorate  of  Tng  Dev 
US  Army  Inf  School 
Ft  Benning,  GA  31905

BAYMOR,  Michael
Directorate  of  Evaluation
USAARMS
Ft  Knox,  KY  40121

BECK,  Vera  J 
Education  Center  -  DPCA 
Ft  Ord,  CA 

BEGLAND,  Bob  Capt 
3033  Stillwood  Ct
Tallahassee,  FL  32303 

BELISLE,  Michael
3507  ACS
119  Robinhood
San  Antonio,  TX  78209

BELL,  M.  Herman,  LTC 
Directorate  of  Tng  Dev 
US  Army  Inf  School 
Ft  Benning,  GA  31905 

BELLANTONI,  Maria,  Dr
US  Army  Trng  Dev  Inst
Bldg  1514
Ft  Eustis,  VA  23604

BERGER,  B.  Michael
USA  MILPERCEN
2461  Eisenhower  Ave 
Alexandria,  VA  22331 

BERGER,  Daniel  L,  SrA
AFHRL/SM
Brooks  AFB,  TX  78235

BERGMANN,  Joe  A
AFHRL/ORA
Brooks  AFB,  TX  78235

BIALEK,  Hilton  M,  Dr
Human  Resources  Rsch  Organ  (HUMRRO) 
27857  Berwick  Dr 
Carmel,  CA  93923 

BIEDIGER,  Rosalie  A
AFHRL/SM
Brooks  AFB,  TX  78235

BIRDALL,  Walter  A
Naval  Educ  &  Trng  Program  Dev  Cen
Ellyson,  PD
Pensacola,  FL  32509

BLACK,  Doris  E
AFHRL/SM
Brooks  AFB,  TX  78235

BLAND,  Raymond  D,  CMDR 
USCG  Training  Center
Cape  May,  NJ 

BOLDOVICI,  John  A,  Dr
HUMRRO
PO  Box  293
Ft  Knox,  KY  40121


BORTNER,  Don  E,  SrA
AFHRL/NA
Brooks  AFB,  TX  78235

BOTTENBERG,  Robert  A,  Dr
AFHRL/SM
Brooks  AFB,  TX  78235

BOUZA,  Duane  J
Base  Educ  Cen  (DPT)
Offutt  AFB,  NE  68113

BOYD,  Joseph  L,  Jr 
Educational  Testing  Service 
10  Pine  Knoll  Dr 
Lawrenceville,  NJ  08648

BOYD,  Richard  D 
MILPERCEN  USA
Alexandria,  VA  22314 

BRAND,  John  S
Directorate  of  Evaluation
US  Army  ADMINCEN
Ft  Benjamin  Harrison,  Ind  46249

BROKAW,  Leland  D,  Dr 
AFHRL/PE 